OF ITEM ANALYSIS 
WILLIAM W. TURNBULL 


College Entrance Examination Board 


It is often desirable to obtain, after the administration of an 
objective test, statistical data concerning the individual test 
items. Many techniques of evaluating test questions have been 
devised. Most of the methods are intended to reveal the diffi- 
culty of each item, and the relationship between each possible 
answer and excellence in the function being tested. The method 
herein presented provides both a pictorial representation and 
numerical indices of item difficulty and of the item-criterion 
relationship. 

The ‘normalized graphic’ method may most readily be 
explained by reference to Figure 1, which shows an item analysis 
sheet of the type in experimental use at the College Entrance 
Examination Board. 

At the right of the sheet is the graph representing a single 
multiple-choice item in a test of achievement in English litera- 
ture. Frequency (expressed in per cent) of selection of each 
answer option is plotted on the ordinate, while the six divisions 
of the abscissa represent successive levels of ability in English 
literature (as indicated by scores on the test in which the item 
appears). Each line graphed represents one of the options 
offered. Consider, for example, the question given at the bottom 
of the analysis sheet. The first option, “orders Macduff’s family 
killed” (the correct response), is represented on the graph by the 
dash-dot line which slopes upward from left to right. The 
upward slope results from the fact that the left side of the graph 
shows the per cent of the poorest students of literature who chose 
this answer, while the right side of the graph shows the per cent 
of the best students who selected it. This differentiation is 
achieved as follows: The students are first ranked in order of 
literary ability (inferred in this case from their total scores on 
the test under analysis); then the group is divided into sixths on 
the basis of ability. The per cent of the students in the poorest 
sixth (Group 1) who chose the first option is plotted on the 
graph. Similarly, still considering the first response, points are 
plotted for each successive sixth of the students until the last 
(Group 6) is represented on the graph; then the points are joined 
by a line. The remaining responses are graphed in the same 
manner. In the case of the correct response, it is hoped that 
more good students than poor ones will select it—i.e. that there 
will be higher percentages plotted near the right of the graph 
than near the left, and hence that the response line will slope 
upward. For the incorrect responses the lines should be negative 
in slope. An item is in need of revision unless the line for the 
right response shows a considerable rise towards the right and 
the line for each incorrect choice shows a decline towards the 
right. 

The popularity of each response may be estimated from the 
height of the line which represents it. In the graph shown 
(Figure 1), line 1 (the correct response) is seen to have attracted 
more candidates than any of the other options. Each choice 
should attract an appreciable percentage of the candidates; and 
in consideration of this fact response 5 should be made more 
plausible, since the line representing it lies so close to the bottom 
of the graph. The actual percentage of people in each score 
group who selected each option is given in the table at the left 
of the graph. The percentage of all people choosing the correct 
option is entered below the table and provides an index of the 
difficulty of the item. 
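The tabulation behind the graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `(criterion_score, chosen_option)` record layout and the toy data are hypothetical, ties in score are broken arbitrarily, and each percentage is based simply on the size of the score group (a later section of the article discusses other bases).

```python
from collections import Counter

def option_percentages(records, n_groups=6):
    """records: list of (criterion_score, chosen_option), one per candidate.

    Returns a list of dicts, one per score group (index 0 = poorest group),
    mapping each chosen option to the per cent of that group selecting it.
    Assumes there are at least n_groups candidates.
    """
    ranked = sorted(records, key=lambda rec: rec[0])  # poorest first
    n = len(ranked)
    table = []
    for g in range(n_groups):
        # Integer boundaries split the ranked list into near-equal groups.
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        counts = Counter(option for _, option in group)
        table.append({opt: 100.0 * c / len(group) for opt, c in counts.items()})
    return table

def difficulty(records, correct_option):
    """Per cent of all candidates choosing the correct option."""
    right = sum(1 for _, option in records if option == correct_option)
    return 100.0 * right / len(records)

# Hypothetical data: twelve candidates with scores 1..12 and their choices.
choices = [2, 3, 2, 1, 3, 1, 1, 4, 1, 1, 1, 1]
records = list(zip(range(1, 13), choices))
table = option_percentages(records)
index = difficulty(records, correct_option=1)
```

Each dictionary in `table` corresponds to one column of the table at the left of the graph, and `index` to the figure entered below it.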

Thus the most important information about the item can be 
determined rapidly by inspection. For quantitative description 
of the item’s value in distinguishing between good and poor 
students, however, it is desirable to know the coefficient of cor- 
relation between choice of the right answer to a given question 
and ability as inferred from total score or some other criterion. 
This correlation is given by the sine of the angle formed by a 
horizontal line and the straight line which most nearly fits the 
six points representing the correct response. (For derivation 
of the sine formula for the coefficient of correlation, see below.) 


ITEM: 

6. When Macbeth hears that Macduff has fled to England, he 
(1) orders Macduff's family killed 
(2) sets out in pursuit of Macduff 
(3) seeks the advice of the witches 
(4) orders Ross to bring Macduff back 
(5) commits suicide 


FIGURE 1 
Analysis of a sample item in English Literature 


The fit of the straight line may be made by inspection, the best- 
fitting (least squares) line drawn on the graph and the sine read 
from a protractor made for the purpose. This method provides a 
visual representation of the best guess as to the actual relation- 
ship between item and criterion, assuming the relationship to be 
linear. Its disadvantages are the difficulties of training per- 
sonnel in fitting the line by eye and the inaccuracies of the fit even 
with trained personnel. 











FIGURE 2 


x- and y-values used in computing correlation coefficients 


A set of special scales makes it possible to obtain quickly the 
sine of the angle which would be formed with the abscissa by the 
line giving the perfect least squares fit, by first finding the slope 
of this line. According to the least squares formula, the slope 
of the best-fitting straight line is $\sum xy / \sum x^2$. In this 
fraction the denominator $\sum x^2$ is a constant, since the x-values 
are constant. The numerator, $\sum xy$, may be found by measuring the 
three y-values shown in Figure 2, multiplying by the three 
corresponding (and constant) x-values and summing.¹ 
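The computation just described can be sketched as follows. The article does not list its constant x-values, so the sketch ASSUMES they are the mean normal deviates of the six equal-sized score groups of a unit normal distribution; the function names are hypothetical, and the six heights are taken to be already expressed in the graph's normalized units.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sixth_means():
    """ASSUMED x-values: mean deviates of the three upper sixths of a unit
    normal, returned as (outer, middle, inner)."""
    cuts = [STD_NORMAL.inv_cdf(k / 6) for k in (3, 4, 5)]
    dens = [STD_NORMAL.pdf(c) for c in cuts] + [0.0]  # density at +infinity is 0
    inner, middle, outer = (6 * (dens[i] - dens[i + 1]) for i in range(3))
    return outer, middle, inner

def slope_by_pairing(y, x):
    """y: six plotted heights A..F (poorest to best group).
    x: (x1, x2, x3) for the outer, middle and inner pairs of points."""
    a, b, c, d, e, f = y
    x1, x2, x3 = x
    sum_xy = x1 * (f - a) + x2 * (e - b) + x3 * (d - c)
    sum_x2 = 2 * (x1 ** 2 + x2 ** 2 + x3 ** 2)
    return sum_xy / sum_x2

def correlation_from_slope(slope):
    """Sine of the angle whose tangent is the slope."""
    return math.sin(math.atan(slope))

# Check on six points lying exactly on a line of slope 0.5.
x = sixth_means()
deviations = [-x[0], -x[1], -x[2], x[2], x[1], x[0]]
heights = [0.5 * d for d in deviations]
slope = slope_by_pairing(heights, x)
r = correlation_from_slope(slope)
grouped_sd = math.sqrt(sum(d * d for d in deviations) / 6)
```

Under this assumption the standard deviation of the six x-values (`grouped_sd`) is about 0.96, and its reciprocal, about 1.04, is close to the 1.045 enlargement the article reports for its scales.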





¹ This is essentially a technique of pairing the y measures which have the 
same x-multipliers: e.g. $y_1$ is actually the distance from point A to $M_y$ plus 
the distance from $M_y$ to point F. As A and F have the same x-deviation 
value (i.e. are the same distance from the mean on the x-axis), $\sum xy$ for these 
two points is $x_1\{(M_y - A) + (F - M_y)\}$ or $x_1(F - A)$. Similarly, for 
points B and E, $\sum xy = x_2(E - B)$; and for C and D, $\sum xy = x_3(D - C)$. 


In practice, it is convenient to use a different scale in measuring 
each of the three vertical values, each scale computed in such a 
way that the y value found is pre-multiplied by $x_1/\sum x^2$, 
$x_2/\sum x^2$ or $x_3/\sum x^2$, whichever is appropriate. Then a 
simple summation gives the fraction $\sum xy / \sum x^2$, which is the 
slope of the best fitting line. But the slope of the line is simply 
the tangent of the angle which is formed by the line itself and any 
horizontal line, and by reference to a table of the trigonometric 
functions, the correlation (sine of the angle) may be read. The three 
scales used at the College Entrance Examination Board are reproduced 
in Figure 3. In their construction, allowance was made for the coarse 
grouping along the x-axis; each scale was enlarged by 1.045 (an 
approximation) to counteract the reduction in r’s obtained when one of 
the variables is divided into only six groups. This reduction, if 
uncorrected, would amount to .0045 for a correlation of .10, or 
to .043 for a correlation of .90. 


The Journal of Educational Psychology 


FIGURE 3 


Scales used in computing correlation coefficients 

[Three scales: one for use with the outer points (Group 1 - Group 6), 
one for the middle points (Group 2 - Group 5) and one for the inner 
points (Group 3 - Group 4). Each scale is set on the per cent point 
for the lower group, and the height of the per cent point in the upper 
group is read from it.] 


DERIVATION OF SINE FORMULA FOR COEFFICIENT OF 
CORRELATION 


The sine formula for the coefficient of correlation is appropriate 
when the graph is so constructed that the standard deviation of 
the abscissa and of each column equals one. The formula may 
be derived as follows: 

The six score groups constitute the columns or y-arrays of the 
correlation surface. It is assumed, for each column, that ability 
in the y-variable is distributed normally about the mean of the 
column, and the further assumption is made that the standard 
deviations of all six columns are equal. Each percentage is 
actually plotted as if the mean of the column were on the fifty 
per cent line. When the points are joined, however, the resulting 
line will be identical with the line which would join the column 
means if the percentages in all columns lay along the fifty per cent 
line and if each mean were placed with respect to its own sigma- 
distance from the fifty per cent line. But the standard deviation 
of each column equals unity, since the ordinate values were 
normalized and the sigmas of ordinate and abscissa were equalized. 
We know that the standard deviation of a column is 
given by the formula 


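$$\sigma_{\text{column}} = \sigma_y \sqrt{1 - r_{xy}^2}$$


or, since $\sigma_{\text{column}} = 1$,


$$\sigma_y = \frac{1}{\sqrt{1 - r_{xy}^2}} \qquad (1)$$


Similarly, the ‘normalized’ abscissa has unit standard deviation, 
i.e. 


$$\sigma_x = 1$$


Substituting this value in the formula for the slope of the regres- 
sion line 


$$b_{yx} = r_{xy} \frac{\sigma_y}{\sigma_x} \qquad \therefore \quad b_{yx} = r_{xy}\,\sigma_y$$


or, writing the slope as $\tan \phi$,


$$\tan \phi = r_{xy}\,\sigma_y$$


Squaring and substituting from (1) above gives 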
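
The closing lines of the derivation are not preserved in this copy. A reconstruction, assuming only the relations already stated (unit column standard deviation, so that $\sigma_y^2 = 1/(1 - r_{xy}^2)$, and $\tan \phi = r_{xy}\,\sigma_y$), would run:

$$\tan^2 \phi = r_{xy}^2\,\sigma_y^2 = \frac{r_{xy}^2}{1 - r_{xy}^2}$$

$$r_{xy}^2 = \frac{\tan^2 \phi}{1 + \tan^2 \phi} = \sin^2 \phi, \qquad r_{xy} = \sin \phi$$

This agrees with the statement earlier in the article that the correlation is given by the sine of the angle between the fitted line and the horizontal.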
METHODS OF COMPUTING PER-CENT VALUES TO BE PLOTTED 


All percentages are based on the number of candidates who 
reached each item ($N_t$, meaning number who tried item) rather 
than on the number in the group. $N_t$ is taken as the number of 
candidates who gave answers for the item being analysed or for 
at least one question which appears later in the test. It may be 
obtained most easily by tallying by hand, for a given sixth of the 
papers, the number of individuals for whom each item is the last 
one answered. The cumulative total (beginning at the last item) 
of the number stopping at each item is also the number who 
reached each item. 


COLLEGE ENTRANCE EXAMINATION BOARD 
WORK SHEET FOR GRAPHIC ITEM ANALYSIS 

FIGURE 4 

Work-sheet for obtaining number ($N_t$) in a given score group who tried each item. 
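
The cumulative tally described above can be sketched as a running total taken from the last item backwards. The function name and the tally values are hypothetical; the example is merely arranged to match the three figures the article gives for its worksheet (twenty items, eighty-one candidates, forty-three finishing).

```python
from itertools import accumulate

def number_reaching_each_item(last_item_tally):
    """last_item_tally[i] is the number of candidates whose last answered
    item is item i + 1. Returns the number reaching each item, in item order."""
    # Running total from the last item backwards, then restored to item order.
    from_end = list(accumulate(reversed(last_item_tally)))
    return list(reversed(from_end))

# Hypothetical tally for one score group: nobody stops before item 15.
tally = [0] * 14 + [5, 6, 8, 9, 10, 43]
n_t = number_reaching_each_item(tally)
```

Here `n_t[19]` (item 20) is 43, and every item up to item 15 was reached by all 81 candidates.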

The worksheet used at the College Entrance Examination 
Board in obtaining $N_t$ for each item is reproduced in Figure 4. 
Inspection of the figure reveals that the worksheet was used with 
a test containing twenty questions; that eighty-one people com- 
prised the group (Group 3) for whom last responses were tallied; 
and that forty-three of these people finished the test. The 
values for $N_t$ thus obtained are transcribed onto the second work- 
sheet, shown in Figure 5. Here also is recorded the frequency 
with which each response is chosen in each of the six score groups. 
The frequencies may be obtained by a hand tally, by utilizing 
punched cards and a tabulator, or, for machine-scorable tests, by 
employing the graphic item counter of an IBM test scoring 
machine. The sum of the response frequencies ($\sum x$) within a 
given group is recorded under $N_t$ for the group. The difference 
between this sum and $N_t$ is the number who reached the item 
(as demonstrated by the fact that they answered at least one 
question appearing later in the test) but did not record an answer 
for it. This difference is written in the top line of the box for 
the item, opposite 0 (for ‘Omitted item’). 

The frequencies on the worksheet shown in Figure 5 are then 
converted to percentages, with $N_t$ for each group as the base, and 
the percentages are entered in the table to the left of the graph. 
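
As a rough illustration of this conversion, the sketch below takes the response frequencies for one item in one score group, derives the number omitting the item as the difference between the number reaching it and the sum of the frequencies, and expresses everything as a per cent of the number reaching it. The function name and the figures are hypothetical, and option labels are assumed never to be 0, which is reserved for ‘Omitted item’.

```python
def item_percentages(response_counts, n_t):
    """response_counts: {option: frequency} for one item in one score group.
    n_t: number in that group who reached the item.
    Returns ({label: per cent}, omitted_count); label 0 means 'Omitted item'."""
    answered = sum(response_counts.values())  # the sum of response frequencies
    omitted = n_t - answered
    if omitted < 0:
        raise ValueError("more responses than candidates who reached the item")
    percents = {opt: 100.0 * f / n_t for opt, f in response_counts.items()}
    percents[0] = 100.0 * omitted / n_t
    return percents, omitted

# Hypothetical frequencies for a five-option item reached by 81 candidates.
counts = {1: 40, 2: 12, 3: 10, 4: 9, 5: 5}
percents, omitted = item_percentages(counts, n_t=81)
```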


PLOTTING THE POINTS 


It is convenient in graphing to use a standard and distinctive 
type of line to designate responses appearing in a given position 
(first choice, second choice, etc.). This code can be shown in the 
table beside the graph. In plotting the points some care must 
be used in interpolating between values indicated on the scale, 
because of its expansion at the ends. 
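
The expansion of the scale at the ends can be illustrated numerically. The sketch ASSUMES that the normalized ordinate is a normal-deviate scale, as the article's reference to normalized ordinate values suggests; the function name is hypothetical.

```python
from statistics import NormalDist

def percent_to_deviate(percent):
    """Height on a normal-deviate scale for a percentage strictly between 0 and 100."""
    if not 0.0 < percent < 100.0:
        raise ValueError("0 and 100 per cent have no finite position on this scale")
    return NormalDist().inv_cdf(percent / 100.0)

# The same five-point difference covers far more of the scale near the end.
centre_step = percent_to_deviate(55) - percent_to_deviate(50)
end_step = percent_to_deviate(99) - percent_to_deviate(94)
```

On such a scale a linear interpolation between printed values is adequate near fifty per cent but increasingly inexact towards the extremes.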


FIGURE 5 


Work-sheet summarizing frequency of choice of each 
possible response in each score group 


TIME ESTIMATES 


The time required to complete an analysis will depend on the 
number of items, the number of responses per item, the number 
of cases and the computational aids available. When the 
test population is five hundred and the items are of the five- 
response variety, the over-all time required for the analysis will 
be approximately: 


30 minutes per item when graphic item counter is used to 
tally frequencies. 

35 minutes per item when punched cards and tabulator are 
used. 

50 minutes per item when hand tallying is used. 


If the test population is reduced in size by one-half, the time 
needed is reduced by only about one-third. Thus the method 
is seen to be fairly time-consuming. 


SHORT CUTS 


The time required may be reduced considerably if certain 
modifications of the full method are made: 

(a) Percentages may be based on $\tfrac{1}{6}N$ or on the total number 
of responses to the item ($\sum x$), rather than on $N_t$ (defined as the 
number responding to the item or to one appearing later in the 
test). If all candidates try all items, $\tfrac{1}{6}N$, $\sum x$ and $N_t$ are all 
equal. If some of the candidates mark no answer for an item, 
however, $\sum x$ will be lower than $\tfrac{1}{6}N$, and if some of these same 
candidates answer items which appear later in the test, $\sum x$ will 
also be lower than $N_t$. Ordinarily in a speeded test 
$\tfrac{1}{6}N > N_t > \sum x$, since 


$\tfrac{1}{6}N = N_t$ + number who have dropped out before reaching 
the item under consideration. 

$N_t = \sum x$ + number who have not answered the item under 
consideration but have answered at least one item appear- 
ing later in the test. 


Arguments can be advanced for the use of any one of the three 
bases for percentaging. $N_t$ is the most difficult statistic to com- 
pute; three to four minutes per item may be saved by substituting 
$\sum x$ for $N_t$, and perhaps five minutes per item by substituting 
$\tfrac{1}{6}N$ for $N_t$. 
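
The effect of the choice of base can be seen in a small worked example. The function name and the figures are hypothetical; the only thing taken from the text is the ordering of the three bases.

```python
def percent_on_each_base(frequency, group_size, n_t, sum_x):
    """Per cent choosing one option, computed on each of the three bases.

    group_size: number in the score group (one-sixth of N)
    n_t:        number in the group who reached the item
    sum_x:      number in the group who actually answered the item
    """
    if not group_size >= n_t >= sum_x >= frequency >= 0:
        raise ValueError("expected group_size >= n_t >= sum_x >= frequency")
    return {
        "group_size": 100.0 * frequency / group_size,
        "n_t": 100.0 * frequency / n_t,
        "sum_x": 100.0 * frequency / sum_x,
    }

# Hypothetical group of 81: 5 dropped out earlier, 6 more skipped the item.
bases = percent_on_each_base(frequency=35, group_size=81, n_t=76, sum_x=70)
```

The smaller the base, the larger the resulting percentage, so figures computed on different bases are not directly comparable.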

(b) The two middle-score groups may be omitted entirely 
from the analysis. Their effect on the correlation is ordinarily 
slight, and the reliability of r is not greatly reduced by their 
omission. The disadvantage is the incompleteness of the picture 
gained of the item: only two-thirds of the candidates are repre- 
sented in the analysis. The saving in time is almost proportional 
to the drop in N, amounting to nearly one-third of the time 
required for the full analysis. 

(c) If the population on which the analysis is based is small 
(under five hundred cases), it is unwise to omit more than the 
two middle groups, because of the unreliability of the resulting 
statistics. If the time available for analysis is very limited, 
however, the graphic method may be used with only two groups— 
the upper and lower quarters being convenient and statistically 
advantageous segments to employ. The chart should in this 
case be modified to include only two vertical lines, located 
slightly closer to the center of the chart than those representing 
the upper and lower sixths in Figure 1. The method then 
becomes a graphic representation of the analysis procedure 
advocated by Flanagan,² with the addition of information about 
the incorrect responses. If only two points are plotted the corre- 
lation may be read immediately from a protractor constructed to 
measure the sine of the angle between the line joining the two 
points and any horizontal line. The time required to complete 
the analysis is approximately half that involved in the full 
normalized graphic method; the standard error of the correla- 
tions, about one-half again as great. 
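
A sketch of the two-group reading follows. It ASSUMES that each percentage is plotted on a normal-deviate scale and that each group is placed at the mean deviate of its tail of a unit normal distribution; neither detail is specified in the text, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def tail_mean(fraction):
    """Mean deviate of the upper `fraction` of a unit normal distribution."""
    cut = STD_NORMAL.inv_cdf(1.0 - fraction)
    return STD_NORMAL.pdf(cut) / fraction

def two_group_correlation(pct_lower, pct_upper, fraction=0.25):
    """Sine of the angle between the horizontal and the line joining the
    points for the lower and upper groups (percentages strictly in 0..100)."""
    x = tail_mean(fraction)  # ASSUMED horizontal position of each group, +/- x
    rise = (STD_NORMAL.inv_cdf(pct_upper / 100.0)
            - STD_NORMAL.inv_cdf(pct_lower / 100.0))
    slope = rise / (2.0 * x)
    return math.sin(math.atan(slope))

r_flat = two_group_correlation(50, 50)
r_rising = two_group_correlation(30, 70)
```

On this assumption the quarters sit nearer the center than the outer sixths (`tail_mean(0.25)` is smaller than `tail_mean(1 / 6)`), in keeping with the chart modification described above.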


EVALUATION OF THE NORMALIZED GRAPHIC METHOD 


The desirability of the method must be decided after comparing 
it with alternative techniques of analysis. The alternatives are 
numerous. 

The usual graphic techniques differ from the one here described 
in that the axes of the graph are not normalized. This means 
that it is difficult to determine the coefficient of correlation 
between the item and the criterion selected—a considerable 
disadvantage when items are to be compared. 

The technique of finding a biserial correlation coefficient 
between item and criterion is one of the most widely recognized 
methods of analysis, and may be completed in approximately 
eighty per cent of the time required for the normalized graphic 





² Flanagan, J. C., “General considerations in the selection of test items 
and a short method of estimating the product-moment coefficient from data 
at the tails of the distribution.” J. Educ. Psychol. 30, 1939, 674-80. 

method. The biserial technique could, of course, be adapted to 
graphic presentation (in which it would probably be somewhat 
more time-consuming than the ‘normalized graphic’ method), 
but since the biserial method entails the determination of only 
two points for each response, it provides no indication of the 
type of curve which depicts the relationship between item and 
criterion. It does, however, make available the standard devia- 
tion of the criterion scores of the group trying each item—a figure 
which is often valuable in interpreting the correlation coefficients 
for items which were not tried by the entire test population. 

A correlation coefficient based on only the upper and lower 
segments of the test population can be presented graphically on a 
simplified chart as explained above under ‘short cuts.’ Insofar 
as this method involves reducing the number of cases available 
for analysis, the stability of the measures obtained is reduced. 
Moreover, non-rectilinear relationships between item and 
criterion are not detectable since only two points on each line are 
found. This method has the very real advantage of speed, 
however, and will probably be preferred where a quick and 
approximate index is all that is required. The same may be said 
of other moderately reliable ‘short’ methods of item analysis. 
(See Lindquist and Cook,³ and Lentz, Hirschstein and Finch.⁴) 

In summary, the chief advantages of the normalized graphic 
method are the detailed information which it gives about each 
option, and the fact that it reveals non-rectilinear relationships 
between item and criterion when such are present. It will be 
found particularly useful when the goal of the analysis is item 
revision, and when the added time required to secure the detailed 
information is not a major consideration. 





³ Lindquist, E. F., and Cook, W. W., “Experimental techniques in test 
evaluation.” J. Exp. Ed. 1, 1933, 163-185. 

⁴ Lentz, T. F., Hirschstein, B., and Finch, F. H., “Evaluation of methods 
of evaluating test items.” J. Educ. Psychol. 23, 1932, 344-350. 


TOWARDS INTRINSIC METHODS IN TESTING 
A. S. LUCHINS 


Department of Psychology, Yeshiva College, and Graduate Faculty, 
New School 


AND 


E. H. LUCHINS 
Department of Mathematics, Brooklyn College 


Considered among the notable achievements of the scientific 
study of educational measurements are the short-type, objective 
tests and the standardization of grading procedures. The 
emphasis on the advantages of objective tests and objective 
norms, sheerly by the amount of time and space devoted to them 
in current educational courses and texts, (e.g., ! and *), tends to 
overshadow their limitations. This paper will deal with a number 
of these limitations and offer some suggestions concerning the 
use of tests in the school. 

Included in most lists of advantages of objective tests are: 1) 
the ease with which they can be scored, and 2) the breadth of 
subject-matter which can be covered in a single test situation. 
These considerations of convenience, although important to the 
teacher, do not constitute a sound pedagogical basis for the 
employment of such tests, since they are not intrinsically related 
to the learning process but pertain rather to administrative 
problems, such as the size of the class and the time allotted to 
testing. 

That objective examinations are admirably suited to test for 
recall or recognition of definite items is true, but is not an unmiti- 
gated blessing. In the short-type test the student gives and the 
teacher scores only the answer, but not the process which led to 
the answer. Thus little information is obtained regarding the 
student’s mental behavior. But one may arrive at an answer 
other than the accepted one, due to some minor error in an other- 
wise proper process of thinking, or one may arrive at the accepted 
answer in spite of an illogical, incorrect process. Moreover, the 
emphasis on the answer per se may discourage the search for 
original and clearer ways of arriving at the answer. Some stu- 
dents memorize the process of thought given by the teacher or 
text or even ignore or forget the process. They learn to associate 
together certain words and phrases. To ponder over the ques- 
tions is not advisable, for one may not have sufficient time to 
answer the many questions which usually are included in short- 
type examinations; besides, it may lead to misinterpretations. 

Objective examinations are structurally not well adapted to 
test for organization, integration or application of what was 
taught. Frequent employment of these tests may develop 
attitudes in the student which foster the memorization of fixed, 
isolated items and so stereotype the student that he does not 
seek, and may even fail to see, the application of what he has 
learned to other situations (2, 90-93). 

There are educators who admit that the thought processes 
involved in obtaining the response, and the organization, integra- 
tion, and application of what was learned, may be evaluated by 
so-called essay examinations. However, many teachers shy 
away from the use of such tests on the grounds that they are more 
difficult to score and that they are subjective. The first objec- 
tion is a consideration of convenience; the second is the criticism 
most often presented against essay tests. In support of this 
criticism experiments are cited in which an answer to an essay 
question, when graded by various teachers,’ or even when 
remarked by the same teacher,‘ is given different grades. These 
data are tacitly considered to be proof that essay tests are 
subjective. 

It should be kept in mind that the essays in these studies were 
scored without any fixed standards, whereas short-type examina- 
tions are usually scored with an answer-sheet or key. Suppose a 
short-type test, say, one in educational or social psychology, was 
given to various experts in the field. If they did not have the key 
before them, and if they did not know the text or viewpoint on 
which the test was based, they might well disagree on the scoring 
of certain items. On the other hand, suppose that the reader of 
an essay examination had before him a definite key; for example, 
one which was arrived at in the following manner: the experi- 
menter answers the questions, outlining the main factual items 
and concepts; then, on the basis of the objectives of the course, he 
allots various credits to— 


1) the main factual items and concepts 
2) the organizational features of the answer 
3) facts brought in from supplemental readings 
4) such formalistic features as spelling and grammar. 


An outline or key of this sort might lessen the disparities in the 
scores given to an essay, particularly if all the readers had dis- 
cussed the key in advance and had achieved mutual understand- 
ing concerning it.* In brief, what the cited studies might be 
showing are not the inherent objectivity of the short-type test 
and the inherent subjectivity of the essay test, but rather the 
effects of grading papers with or without a pre-established, detailed 
frame of reference. 

One who realizes that the usual arguments for the employment 
of short-type and against the employment of essay tests are not 
conclusive, might ask: “What should determine the type of test 
to be given?” The answer is: The nature of the course. If the 
course dealt with highly controversial or rather undefined sub- 
ject-matter, if its aim was the organizing and synthesizing of facts 
and theories, there is little point in giving the short-type test; 
if no attempt was made to integrate the various facts and skills 
taught, then it is unfair to expect students to do so in an essay 
test unless the specific aim is to see how well they can organize 
the material without having been taught to do so. Before decid- 
ing on the type of examination and its contents, the teacher might 
ask: what are the objectives of the course; what main points, 
facts, skills, and insights did the subject-matter to be tested aim 
to impart to the students; what test will best measure what is 
desired to be measured? The final result need not consist of all 
essays or all short-type questions, but may be a combination of 
the two. Moreover, it may be that the outcome of instruction 
can be evaluated as well—or even better—by non-written 
examinations, by a recitation period or practical test. 





* Cf. this method with the usual extrinsic procedures suggested for 
improving the validity and reliability of grading essay examinations. For 
example: After reading the answers to questions, “place the papers in about 
five piles representing levels of merit. In a typical class there should be 
about ten per cent in the highest and lowest pile; about twenty per cent in 
the next to the best and next to the poorest; and about forty per cent in the 
middle or average group. . . . Reread the answers to these questions and 
shift those papers which seem to be out of place” (1, p. 599). Note the two 
underlying assumptions, both of which can stand critical inspection: 1) 
objectivity is increased by marking the papers on a relative scale, and 2) 
the grades should follow a normal distribution curve. 

In an attempt to standardize grading procedures, teachers have 
been advised to set up frequency distributions of the students’ 
scores on a test and to use deviations from the mean or median, 
or some other statistical pattern, to determine the various cate- 
gories of achievement, e.g., A, B, C, D or F. Whenever feasible, 
test norms should be based on a large representative population. 
The underlying assumption is that the abilities of the students 
vary in accordance with a normal distribution curve and that the 
larger the population studied, the closer will the distribution of 
grades approach the Gaussian curve. 

This method of grading gives the standing of an individual 
relative to all others who took the same test but does not neces- 
sarily reflect what that individual has achieved relative to the 
requirements of the course. Evaluations of the latter type are 
necessary for an appraisal of the efficacy of the educational proc- 
ess. The writers have found the following a useful grading 
procedure on the college level: 

The minimum that a student ought to obtain from the course is 
determined. This is done on a practical basis, e.g., on the basis 
of what is needed in the next course. On these minimum require- 
ments, a hierarchy of facts, skills, abilities and achievements is 
built with specific grades assigned to various levels of functioning; 
e.g., F or failure may represent the level of not even being able to 
achieve the minimum requirements; D, the level of merely 
reproducing what was taught, just meeting the minimum require- 
ments; C, reproduction together with some integration and 
application of the material; B, integration and application to 
situations not obviously similar to the ones used in class; and A, 
the level of making new inferences, worth-while suggestions, and 
of being able to apply the principles taught to new and different 
situations. 

To overcome the possible subjectivity of the individual teacher 
in the establishment of the requirements and levels, these should 
be decided upon by the supervisors of the educational system in 
consultation with those in the school or school district who are 
responsible for the setting up and teaching of the course. 

The present stress on tests as grading devices tends to over- 
shadow their diagnostic function. Tests should be given not 
merely to grade, but also to determine the efficacy of the teaching 
process; to find out what the individual student has mastered; to 
diagnose his difficulties; and to help in the establishment of a 
remedial program. Consider the case of the student who, were 
he graded on a relative basis, would have a high score because the 
class as a whole did not do well. If the test reveals that the 
student did not master the minimum requirements, or did not 
reach as high a level as he is deemed capable of, then, regardless of 
whether his standing is relatively high or low, a remedial program 
should be set up for him. The test should not be viewed as 
closing the door on the learning process. 

There are times when a knowledge of the student’s behavior 
during the examination period is necessary for an intelligent 
analysis of his responses. Therefore, whenever possible, the 
teacher of the course should proctor the examination. If this is 
not feasible, the proctor should be careful to note and to report 
to the instructor any unusual behavior manifestations. The 
attitude of the proctor is important not only to insure compliance 
with the examination rules, but, also, in order not to create a 
fear situation. Neither stern suspicion nor lax inefficiency is 
desirable, but friendly assistance to see that the materials for 
taking the test and understanding its instructions are available 
to the students. 

In some classes the entire course is geared to the taking and 
passing of tests. After every chapter, every lecture, every unit, a 
test is given. Under these circumstances the teacher may direct 
his teaching to prepare the class for tests. The student may learn 
in order to pass tests, thus developing a superficial relationship to 
the subject-matter—learning it for the sake of a test grade. 

In addition to fostering this extrinsic relation between student 
and subject-matter, the emphasis on grades makes competitive 
‘grade-grabbers’ of some students, and the emphasis on examina- 
tions as grading devices makes the test situation an ordeal for 
many. Even on the college level it is not uncommon to find that 
the atmosphere in an examination-room is charged with tension, 
anxiety, and nervous haste. In one college where there usually 
was a wave of anxiety during final examination week, a survey 
was conducted to determine possible causes of this test atmosphere.
The following factors were noted: there were too many
examinations within a short period of time; the final examination 
score weighed heavily in determining the term grade; the instruc- 
tor liked to prepare impressive tests which contained tricky 
questions or material not covered in class; some tests were too
long for the time allotted; the instructor marked on the basis of 
a curve and gave just so many ‘A’s’ and so many ‘F’s’—some 
students were anxious not to lose their A’s and others dreaded 
getting the F’s which must be given. 

A number of the suggestions made as a result of this particular 
investigation have general bearing. Examinations should be so 
spaced that a student is not overburdened by successive hours of 
testing; tricky questions should be avoided (one such question 
may so upset the student that he may fail to answer succeeding 
questions with which he would otherwise have been able to cope); 
no one test should be permitted undue weight in determining the
final grade; the length of the test should be commensurate with 
the time allotted. 

This last factor is an important one. That working under 
‘speed-up’ conditions has a deleterious effect on thinking was seen 
in an experiment, involving arithmetical problems, which was 
administered to college and public school classes (?, 53-56). In 
contrast to the rather carefree atmosphere which had existed in 
college classes when ample time was allotted per problem, the 
students now were obviously strained; they spoke of rushing, and 
complained that their nervousness prevented them from thinking 
and being accurate. In the public schools “great anxiety, haste,
and competition were observed; faces were strained, pencil points
broke, many children moaned and groaned and a few even wept.
All comments told of their being fearful, worried, upset, and some
dramatically proclaimed that they were so frightened they
thought they would die; . . . . that their minds were in a constant
whirl, and that they hoped never to get such a test again”
(?, 55). Thus, when a student takes an examination which 
necessitates his working under pressure of time, the final result 
may not be at all indicative of what he is capable of doing in a 
less trying situation. 

Perhaps what is needed to combat the superficial attitude 
toward learning, the competition for grades, and test-tension, is 
the adoption of a more wholesome attitude towards grades and 
tests. An educational program should be established for stu- 
dents, teachers and parents, to attempt to decrease the competi- 
tion for high grades and honors. Stress should be placed on each 
student functioning at his optimum level. Tests should be
regarded primarily as diagnostic devices to determine this level,
and only secondarily as grading devices. As long as one functions 
well at his level, can apply what he has learned, and is socially 
adjusted, that is reward enough for learning. 


SUMMARY


In anticipation of an eventual boom in school building con- 
struction during the late 1940’s, architects and educators should 
work together in the interests of more effective classroom teaching 
to design structures which will be as free as possible from unwanted 
noise. 

A search through most of the pertinent sources of the last 
thirty years reveals surprisingly few classroom studies of the 
‘controlled experiment’ type on the changes in the behavior of 
pupils of school age (five to eighteen years) under conditions of 
relative ‘quiet’ vs. ‘noise.’ The great weight of the evidence, 
however, indicates that performances which are prized indi- 
vidually and socially tend to be reduced in various degrees in an 
auditory environment marked by annoying and distracting 
sounds. Indirectly, sudden loud noises produce fear reactions; 
repeated distracting noises evoke angry responses; neither state is 
favorable to learning. The efficiency of all kinds of mental 
work, especially the more complex varieties, is generally notice- 
ably lowered. This is shown by these signs: (1) increased test 
errors; (2) decreased speed; (3) complaints of headache, nausea, 
fatigue, irritability, and allied unpleasant organic sensations; (4) 
decreased number of stomach contractions and lessened gastric 
flow; and (5) small but consistent increases in metabolic rate, 
muscle tension, heart rate, higher brain and arterial pressure 
(especially the systolic phase), and greater breathing rate and 
volume—all indices of an organism laboring under heavy pressure 
and compensating with effort to meet the unusual demands 
placed upon it. 

Comparable researches in industrial hygiene on adult workers 
suggest that while the human organism exhibits remarkable 
powers of adjustment to initially objectionable auditory stimula- 
tion, such adaptation is evidently bought at a price, i.e., the 
‘physiological cost’ is increased as symptomized by an eventual 
lowering of the threshold for emotional instability with its accompanying
heightened susceptibility to misconduct. It is generally
agreed that the ‘intensity’ of the noise is less significant than the 
‘kind’ of noise involved, i.e., its meaning as an interruption or 
distraction, and what the organism is engaged in doing when it is 
applied. Undesirable and major long-term consequences must 
be distinguished from the often negligible minor short-term effects 
of noise exposure. As with all psychological phenomena, indi- 
vidual differences are marked, some persons even temporarily 
performing better in noisy surroundings; but the over-all picture,
so far as beneficial pupil modification goes, is such as to give full
scientific support to all efforts to reduce needless noise in school
buildings on the grounds of both administrative efficiency and
‘humanity’ as a social policy.


I. THE GENERAL PROBLEM 


It is a commonplace observation that the overcrowding incident 
to our urban industrial life exposes the nervous system to more 
strain and tension than the presumably more serene milieu of our 
more rural forebears. One prominent feature in this stress situation
is the perpetual overstimulation of the auditory mechanism.
The often futile quest of contemporary man for ‘peace and quiet’ 
is not only a search for the less worrisome existence he has lost, 
but literally a pursuit of silence of the Quaker variety. The 
problem is international as one may gather from an inspection of 
British, French, German, and especially American references to 
this topic. In 1930, New York as a municipality engaged in an 
official and monumental inquiry on City Noise; and Chicago 
and other metropolitan centers here and abroad established 
permanent Noise Abatement Commissions staffed by medical, 
engineering, legal, and other specialists to deal with this widely- 
ramified issue. 

Referring specifically to New York, the poet Joyce Kilmer 
wrote these lines:


    The truck and motor and trolley car and the elevated train
    They make the weary city street reverberate with pain.


Apparently, sensitive and ‘intellectual’ observers—not neces- 
sarily crotchety old folks—have always reacted in this fashion. 
Schopenhauer was tortured by the crack of a wagon whip and
believed (like many teachers) that “noise is the true murderer of 
thought.” By ‘thought’ he meant what we now more precisely 
label ‘sustained intellectual endeavor.’ Herbert Spencer was so 
much affected by noise that he used to plug his ears with wool, 
and declared (perhaps somewhat unfairly and projectively): 
“You might gauge a man’s intellectual activity by the degree of 
his intolerance of unnecessary noise.” 

Yet the matter is far from simple or unambiguous in its implications.
A lover of the Manhattan roar that helped to kill him,
O. Henry, declared in his Adventures in Neurasthenia that—like 
many chronic urbanites—he could not sleep without the familiar 
and comforting lullaby of city noise, and complained (not alto- 
gether humorously) of the silence of the country so deep he could 
hear “the grass blades sharpening themselves against each other.”
There are many subtle psychological aspects to the apparently
straightforward problem of noise in the modern world. 


II. THE ALLEGED ILL EFFECTS OF SCHOOL NOISES 


Mental hygienists are all but unanimous in holding that noise 
interferes with attention and concentration, thereby making the
tasks of teachers and pupils more difficult. Many psychiatrists 
hold that growing children are in need of extreme quiet during the 
day if they are to develop normally, a view partially supported by 
the observation that laboratory animals in quiet cages thrive 
more on the same diet than ‘controls’ exposed to prolonged noise. 
There have been repeated suggestions that there is some con- 
nection between life against a constant background of noise and a 
disposition to criminal or anti-social acts, an hypothesis probably 
based on the observation that some juvenile delinquents in classes 
for the retarded manifest a belligerent mood akin to that which 
normals exhibit when they are kept keyed up by unpleasant 
or excessive stimulation. Physicians frequently emphasize that 
“everyone recognizes the difference between children reared in
quiet, peaceful surroundings and those brought up amidst the
roar and din of traffic.”

A plausible derivative of this approach is the assertion that 
what has been called ‘jazz-mindedness,’ i.e., jerkiness in thinking, 
is a symptomatic result of life led among interruptions and
distractions of all kinds, of which the auditory are but one major
class. Fifteen years ago, Foster Kennedy, the distinguished
neurologist, declared:

“The fact that school children cannot concentrate so well
under the influence of noise has a profound effect on their work.
It often means that whole hours of the day are completely
wasted, because if we do not concentrate well, we cannot
remember—only those ideas on which we have trained the full
searchlight of our conscious mind become clearly recorded in our memory.
Children may sit all day in the noisy schoolroom never learning
how to focus this searchlight upon the facts before them.
They may finish their full years of schooling and never achieve
any clear pictures in their memory.” (p. 248 of City Noise, New
York, 1930.)

This same specialist also showed by means of operative surgery
that the increase in brain pressure associated with a sudden loud 
noise (such as playfully bursting an air-inflated paper bag) is 
greater and more protracted than even that produced by shock-producing
drugs. However, such impressive methods of presenting
the data about noise are in danger of overstating the case
and tend to be misleading so far as ordinary persons under
ordinary conditions are concerned. 


III. METHODS OF MEASURING NOISE 


More prosaic but equally relevant are the clever measures of 
sound intensity or loudness that are now commonly made by 
acoustical engineers. To understand them it is necessary to 
supply a few reference points in terms of the following readily- 
perceived situations translated into ‘noise units’: 


A subway local station with the express passing = 95 decibels
Average of six factory locations in New York    = 65 decibels
Average Manhattan business office               = 50 decibels
Average New York City residence                 = 30 decibels
A quiet garden                                  = 20 decibels


The above values, of course, are approximate only. Loudness
measurements made in school buildings by Rettinger (“Yardstick
on Sound,” Nation’s Schools, 1937, 19, 56) may be usefully compared
to the ‘standards’ above. He found a figure of 69.7
decibels for a noon-hour corridor—a situation which Jacques
Barzun has described as like that of an aviary on fire, since the
noise volume is higher than that of a factory!—48.2 decibels for a
classroom twelve by twenty-eight by forty feet; 47.3 for one 
twelve by twenty by thirty; and 51.0 for one measuring fourteen 
by thirty by forty-eight. These last figures are obviously com- 
parable to those prevailing in a large or mass-desk commercial 
office. 

Harvey Fletcher, the famous scientist of the Bell Telephone 
Laboratories, reported (on p. 143 of City Noise, supra) that “the
average street noise level of Public School No. 7, 3205 Kingsbridge
Avenue at 232nd Street, is about 48 to 53 decibels; for Public
School No. 33, 2424 Jerome Avenue and Fordham Road, the
corresponding level is 58 to 68 decibels.”

In assimilating these figures, it must not be forgotten that the 
decibel (or one-tenth of a bel) is a psychophysical unit that has to 
be interpreted logarithmically. What this means practically 
may be conveyed by comparing this conception of loudness inten- 
sity with temperature. The entire audible range of sounds for 
the human ear runs roughly between 5 and 115 decibels. A noise 
of 100 decibels is crudely as uncomfortable as a temperature of 
100°; also a reduction of noise to 70 decibels creates subjectively 
about the same sense of relief to the organism as a drop in heat to 
70°; but this parallel should not be pushed too literally. 
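
The logarithmic character of the unit may be made concrete by a short computation. The sketch below rests on the standard definition that a difference of 10 decibels corresponds to a tenfold ratio of sound intensity; the function name and the particular comparisons are illustrative choices and are not drawn from the sources cited. It treats the decibel as a measure of physical intensity and says nothing about how loud a sound is judged to be.

```python
def intensity_ratio(level_a_db: float, level_b_db: float) -> float:
    """Ratio of sound intensity at level A to that at level B.

    Uses the standard relation: a 10-decibel difference is a tenfold
    intensity ratio, so the ratio is 10 ** (difference / 10).
    """
    return 10 ** ((level_a_db - level_b_db) / 10)


# Levels quoted in the text above.
hundred_vs_seventy = intensity_ratio(100, 70)    # the 100 vs. 70 decibel comparison
corridor_vs_factory = intensity_ratio(69.7, 65)  # noon-hour corridor vs. factory average
subway_vs_garden = intensity_ratio(95, 20)       # subway station vs. quiet garden

print(f"100 dB vs. 70 dB: about {hundred_vs_seventy:,.0f} times the intensity")
print(f"Corridor vs. factory: about {corridor_vs_factory:.1f} times the intensity")
print(f"Subway vs. garden: about {subway_vs_garden:.2e} times the intensity")
```

On this reckoning the drop from 100 to 70 decibels is a thousandfold reduction in intensity, which is one reason the temperature parallel should not be pushed too literally.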

Unfortunately, much of the accuracy of measurement obtain- 
able by this method is not strictly relevant, since the disturbing 
effects of most noise are ordinarily less a function of the absolute 
quantity or noise level (although this is not to be neglected) than 
a product of the special qualitative noise pattern with its dis- 
tinctive meaning as an ‘interruption’ of an existing mental set 
plus whatever allied unique apperceptive significance this has to 
the hearer. 

At least this degree of psychological subtlety or sophistication 
about the subjective life is a prerequisite to practical control.

Protection against the injurious effects of noise is something 
like air-conditioning in the sense that it increases the feeling 
of comfort and well-being. Bad smells, bad sights, and bad 
sounds are a priori ‘aesthetic’ impediments to that relaxed state 
of euphoria which seems to facilitate most learning operations. 
The strictly physically-minded argue that if continued exposure 
to the pounding of the ocean waves fatigues the body of the surf 
bather, why should not the unremitting impact of a rough sea of 
sound waves—which are equally real—do likewise, even though
in somewhat less tangible form? 


IV. COMPARATIVE PUPIL PERFORMANCE IN QUIET AND 
NOISY SETTINGS 


With seventeen Lincoln High School pupils as subjects, Brown 
(“Experiment To Show the Effect of Noise,” Science Education,
1938, 22, 343-348) used two equivalent forms of the Otis Self- 
Administering Test of Mental Ability, Higher Examination. 
One form was given under so-called ‘normal classroom conditions’;
the other was taken while a ‘horrible noise,’ described as
akin to a combined airplane roar and squawking phonograph, 
filled the room. The average scores favored the ‘quiet’ arrange- 
ment, since a mean drop of three points was registered by the 
group under ‘noisy’ conditions. A better picture of the differ- 
ence may be obtained by noting that fifteen pupils lost one to 
eight points, and only two gained one and two points, respectively. 
It should be observed, too, that Brown made his set-up difficult 
for his own apparent thesis by using a well-known intelligence 
test, which, if properly constructed and standardized, is notori- 
ously resistant to all but gross alterations in the functioning 
capacity of the person being tested. 
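
The count of fifteen losses against two gains can be weighed with an exact sign test. The following is a minimal sketch and no part of Brown's own report: it assumes that no pupil's score was unchanged, that the pupils responded independently, and that gains and losses would be equally likely if noise had no effect. The function name is hypothetical.

```python
from math import comb


def sign_test_p_value(n_losses: int, n_gains: int) -> float:
    """Two-sided exact sign test under a 50/50 null, ties excluded."""
    n = n_losses + n_gains
    k = max(n_losses, n_gains)
    # Probability of a split at least as lopsided as k out of n, in one direction.
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * one_tail)


# Brown's counts as given above: fifteen pupils lost points, two gained.
p_value = sign_test_p_value(n_losses=15, n_gains=2)
print(f"Two-sided sign-test p-value: {p_value:.4f}")
```

Under these assumptions so uneven a split would be rare by chance alone, although with only seventeen pupils and a single noise source the result remains suggestive rather than conclusive.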

A more elaborate experiment with a similar conceptual basis is 
reported by Flexner (“How Noise Affects School Work,” American
School Board Journal, 1932, 84, 85-86). His study was based
on two hundred boys in the fourth to twelfth grades inclusive at 
the Riverdale School, in accordance with the following complex 
but systematic plan: 

On the first day, the subjects were familiarized with the
procedure by taking tests, not used later, under medium noise
conditions. The noise source consisted of composite tones emitted
by an audiometer—a mild form of stimulation when compared to
ordinary traffic din.

On the second day, the Shank Test of Reading Comprehension,
Form A, and the Woody-McCall IV were administered under
moderate noise to Group A and in quiet to Group B.

On the third day, Group B received Shank, Form B, and
Woody-McCall II under moderate noise, i.e., 55 decibels, and
Group A in quiet, i.e., ordinary room noise of about 30 to 35
decibels.

On the fourth day, Group B received Woody-McCall III in
quiet and Group A both Woody-McCall and Shank C under loud
noise (70 decibels).

On the last day, Group B took Woody-McCall I and Form C of
the Shank test under loud noise, and Group A received
Woody-McCall I in quiet.

It will be seen that this procedure in effect stacked the cards 
against the experiment by permitting a substantial practice- 
effect to operate over a period of almost a week to create the 
much-encountered adjustment or ‘accommodation’ phenomenon 
met with in dealing with most cases of organic resistance to 
noxious stimuli. Nevertheless, the mean results are quite 
revealing as the following figures show: 


                      Quiet   Medium noise   Loud noise
Woody-McCall Test
Elementary School     18.35      19.30         17.13
Junior High School    26.27      27.79         24.29
Senior High School    28.47      29.01         27.54

Shank Test
Elementary School     56.31      56.28         56.06
Junior High School    64.58      63.55         61.54
Senior High School    76.81      76.93         76.75


None of these differences are spectacular, but they are excep- 
tionally consistent in demonstrating a uniform decline in efficiency 
of even basic or routine mental processes under the noisy condi- 
tions imposed. 
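
The size and direction of each difference can be read off more easily when the tabled means are subtracted row by row. The sketch below merely transcribes the figures given above and takes each noise condition minus quiet; the variable and function names are illustrative. It performs no test of significance, since the original report as quoted here gives no measures of variability.

```python
# Mean scores transcribed from the table above: (quiet, medium noise, loud noise).
MEANS = {
    ("Woody-McCall", "Elementary School"): (18.35, 19.30, 17.13),
    ("Woody-McCall", "Junior High School"): (26.27, 27.79, 24.29),
    ("Woody-McCall", "Senior High School"): (28.47, 29.01, 27.54),
    ("Shank", "Elementary School"): (56.31, 56.28, 56.06),
    ("Shank", "Junior High School"): (64.58, 63.55, 61.54),
    ("Shank", "Senior High School"): (76.81, 76.93, 76.75),
}


def differences_from_quiet(means):
    """Return {row: (medium - quiet, loud - quiet)}, rounded to two decimals."""
    return {
        row: (round(medium - quiet, 2), round(loud - quiet, 2))
        for row, (quiet, medium, loud) in means.items()
    }


diffs = differences_from_quiet(MEANS)
for (test, level), (d_medium, d_loud) in diffs.items():
    print(f"{test:13s} {level:19s} medium {d_medium:+.2f}   loud {d_loud:+.2f}")
```

Read in this way, the loud-noise mean falls below the quiet mean in all six rows, whereas the medium-noise mean lies above the quiet mean in four of the six; the decline is therefore carried chiefly by the loud condition.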

A confirming inquiry of a less extensive nature has been presented
by Hsaio (“An Experiment on the Influence of Noise
Upon Work,” Chinese Educational Review, 1937, 27, 99-102). 
For ten minutes, thirty-five fourth-grade pupils were required to 
work on the same one hundred arithmetical operations, multiplica- 
tion of a two-place number by a digit-place number, first under a 
noisy and then (after eight days) under a quiet environment, 
the instructions being the same. His results showed that noise 
had a detrimental effect on work in respect to: (1) total number of 
multiplications, (2) percentage of wrong answers, and (3) multi- 
plications correctly worked. The noise caused a decrease of 
speed by 5.6 per cent, an increase of wrong answers by 26.6 per 
cent, and a decrease of ‘efficiency’ by 8.4 per cent. Such per- 
centage changes reported here (and elsewhere in this paper) 
must always be cautiously interpreted. 
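
One reason for such caution is that a percentage change depends entirely on the base from which it is reckoned. The sketch below uses hypothetical error rates, which are not taken from Hsaio's report, chosen only so that the relative increase comes to 26.6 per cent; it shows that so large a relative change may correspond to little more than one additional wrong answer in a hundred.

```python
def relative_change_pct(baseline: float, observed: float) -> float:
    """Change from baseline to observed, as a percentage of the baseline."""
    return (observed - baseline) / baseline * 100


# Hypothetical error rates (per cent of answers wrong); NOT from the original study.
quiet_error_rate = 5.00
noisy_error_rate = 6.33

relative = relative_change_pct(quiet_error_rate, noisy_error_rate)
absolute_points = noisy_error_rate - quiet_error_rate

print(f"Relative increase in wrong answers: {relative:.1f} per cent")
print(f"Absolute increase: {absolute_points:.2f} percentage points")
```

Had the hypothetical base rate been 30 per cent instead of 5, the same relative increase would have meant about eight additional wrong answers in a hundred, which is why the base should be reported alongside any percentage change.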

From the Orient, we leap to Europe where much the same rule
of behavior seems to have been confirmed by Schnell, an Hungarian
investigator (“A zaj hatasa az idegrendszerre,” Varosi
Szle, 1933, 39, 35-45). The Kraepelin Addition Test was taken 
for fifty minutes by twenty-four boys and twenty-seven girls, 
both groups averaging eighteen years old. The first experimental 
day was without noise, the second under the influence of a 
monotonous noise made by a buzzer with a megaphone, the third 
under that of a strong noise of variable intensity reproduced by a 
phonograph record amplified by radio. The monotonous noise 
had no deteriorating but rather a stimulating effect upon the 
performance of both sexes. However, street noises diminished 
the performance of the boys more than that of the girls. In both 
cases, the quality of the work was injured more than the quantity 
—a commonly-observed phenomenon.

The leading Hungarian psychologist, Paul Ranschburg, com- 
menting on these findings, which were made under his supervision, 
added the observation that noise hinders the carrying off of waste 
products, disturbs the normal biotonus, and demands the con- 
tinual forced absorption of psychoneural energies required for 
concentration against the diversion of attention through these 
noises. He emphasizes the fact that affective elements may 
augment the damage caused by small amounts of noise—an 
observation frequently made by other authorities in this 
area. 

Different in nature from the type of evidence offered above, but 
important because of its source, is the account of his anti-noise 
activities presented by Irwin T. Catharine, Superintendent of 
Buildings for the Philadelphia Public Schools (“Acoustical
Experiments in the Classroom,” Nation’s Schools, 1936, 17, 49-50;
55-56). Although no true controls were employed, he states that 
the ‘sound-proofing’ of sample spaces resulted in much better 
behavior in going to and from classes, reduction of reverberating 
corridor noises, better attention from the children in class, and 
superior discipline because whispering is too conspicuous when 
it resounds clearly in an acoustically-treated classroom. Because 
they can now be heard better, shy and weak-voiced children par- 
ticipate more in the class activities than before. The teachers 
apparently were enthusiastic and campaigned vigorously for the 
extension of these benefits to all classrooms and buildings. 


V. MISCELLANEOUS LABORATORY EVIDENCE


While no one can doubt that it is a gross economic waste when 
teaching, concentration, industriousness, and learning are main- 
tained with effort, it should not be forgotten that ‘noise is essen- 
tially any sound which the hearer treats as a nuisance.’ It is 
merely unwanted auditory stimulation, and therefore dependent 
upon the larger ‘values’ or specific temporary interests of the one 
hearing. A blaring radio that enrages one listener brings smiles 
to another. The Mellon Institute (rather, H. M. Johnson, who 
worked there) found years ago that the noise that disturbs the 
sleep of one may actually enhance the rest of another. Complex 
biological rhythms also play their parts: the time of the night 
when a sleeper stirs the least is not the time at which street noises 
are at a minimum.

Yet the deleterious effects of certain noises on organic functions 
have been repeatedly demonstrated. In an experimental study 
of the effect of noise on the gastric secretion in four “Pavlov
dogs,” P. E. Vaughan and E. J. Van Liere (Journal of Aviation
Medicine, 1940, 11, 102-107) discovered that the amount of acid 
was significantly less in two dogs under 100 decibel noises than 
under 30 decibel conditions. A parallel inquiry on the loudness of 
auditory stimuli which affect stomach contractions in four healthy 
young persons by E. L. Smith and Donald Laird (Journal of the 
Acoustical Society of America, 1930, 2, 94-98), showed that 
auditory stimulation of 87 decibels decreased by 37 per cent the 
number of stomach contractions per minute and completely 
altered the type of contraction involved; indeed, the effects were 
remarkably similar to those produced by fear. It has long been 
acknowledged that city noises are to a considerable degree respon- 
sible for the prevalence of digestive disorders in modern life. 
Laird (Medical Journal and Record, 1932, 136: 12) reports that the 
flow of gastric juice was depressed by stimuli of 40 and 60 
decibels, respectively—loudness intensities that are typically 
present in New York City public schools (vide supra)! 

Because of the unity of the organism, one would rightly expect 
that more than the digestive system is involved in this complex 
reaction, and this anticipation finds confirmation in another study 
by Laird on the influence of noise on production and fatigue as 
related to pitch, sensation level, and steadiness of noise (Journal 
of Applied Psychology, 1933, 17, 320-329). He had four young
men work on the standard laboratory dotting machine; all 
reduced production when varying complex noises were applied. 
Under intense noises, they reacted with muscular stiffness in the 
neck and legs, and there was a great increase in the volume of 
urine secreted. 

A Japanese research by Obata, Morita, Hirose, and Matsumoto
on the effects of noise on human efficiency (Journal of the Acous- 
tical Society of America, 1934, 4, 255-261) used juvenile and adult 
subjects on cancellation and addition tests under conditions of 
silence, music, and noise. They found that both noise and music 
generally lowered working speed and accuracy. 


VI. SUPPORTING INDUSTRIAL INVESTIGATIONS 


Enlightened engineers and managers have for many years been 
concerned to produce optimum working conditions for their 
employees, since they have discovered it brings solid monetary 
returns. A sample of such efforts is found in an article by Hodge 
(Personnel Journal, 1936, 15, 11-18). He achieved a reduction of 
5 to 15 decibels in the noise level of a room which the occupants 
declared was equivalent to the difference between a noisy office 
and a quiet one. A reduction of forty-two per cent and twenty- 
four and one-half per cent, respectively, in clerical errors occurred 
in two offices where sound-absorbent materials had been installed. 
Psychological tests demonstrated a mean forty per cent loss in 
typing speed, together with a nineteen per cent increase in the 
amount of energy used under noisy conditions. 

A related inquiry on the fatigue reaction to noise (Industrial 
Welfare, 1935, 35-37) showed that basal metabolism when typing 
increased fifty-two per cent over rest when the room was quiet 
and seventy-one per cent when it was noisy! Production 
increased in assembling temperature regulators when noises were 
systematically reduced. 
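
The two metabolic figures just quoted are both reckoned from the resting rate, so their difference of nineteen points is not itself the added cost of noise relative to quiet typing. The sketch below makes the conversion explicit, assuming that both percentages refer to the same resting baseline; the function name is illustrative.

```python
def noisy_over_quiet_pct(quiet_rise_pct: float, noisy_rise_pct: float) -> float:
    """Extra expenditure in noise relative to quiet work.

    Both arguments are percentage rises over the same resting rate.
    """
    quiet_level = 1 + quiet_rise_pct / 100  # quiet typing as a multiple of rest
    noisy_level = 1 + noisy_rise_pct / 100  # noisy typing as a multiple of rest
    return (noisy_level / quiet_level - 1) * 100


# Figures quoted above: 52 per cent over rest in quiet, 71 per cent in noise.
extra_cost = noisy_over_quiet_pct(52, 71)
print(f"Typing in noise vs. typing in quiet: {extra_cost:.1f} per cent more")
```

On this assumption, typing in the noisy room demanded about one-eighth more energy than the same work in the quiet room.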

The Effects of Noise on School Children 159

Representative of the more conservative but technically
superior British studies in this field is the standard analysis made
by Weston and Adams on the effect of noise on the performance of
weavers (Industrial Health Research Board, 1932, Report No. 65,
pp. 38-62). Active textile machinery notoriously makes a great
din and these studies were done directly in a weaving shed.
Specially-constructed ‘ear-defenders’ effected an appreciable
reduction of loom noise; a selected group of twenty-six weavers
was observed throughout a period of twenty-six weeks. The
reduction of noise appeared to bring about an increase in average
hourly output per weaver of approximately one per cent. This increase
was greatest during the early stages of the day’s work and it is 
therefore suggested that “even after years of work in a noisy 
environment, the worker does not become completely adapted or 
acclimatized to noise, but goes through the process of adaptation 
daily.” This is an important conception, for it implies that in a 
deeper sense one’s system never really gets used to noise, and that 
the wear and tear on one’s tissues goes on continually under such 
conditions. 


VII. CONCLUSIONS AND RECOMMENDATIONS 


The summary at the beginning of this report should be read 
again at this point. It can be confidently asserted that unless the 
normal person is occupied in certain highly specific tasks, noise 
as such does not produce either permanent or transitory impair- 
ment of hearing. The evil effects are of quite a different order, 
such as occur when sounds we wish to hear (say, a teacher’s 
voice) are masked or obscured by the high pitches of unwanted 
sounds (one of the unrecognized effects of children’s ‘screeching’). 

Ever since the publication of Morgan’s classic work on distrac- 
tion via sounds (Archives of Psychology, 1916), it has been a 
truism among psychologists that the disturbing effect of sound 
when one is engaged in some activity, is not a pure function of the 
noise per se, but of the noise in relation to the task. Like all 
figure-ground phenomena, this one obeys the rules of Gestalt 
organization. On the whole, therefore, the more the work puts a
demand on the higher mental processes, the more disturbing the
noise is likely to be.

The British authority, Bartlett, has this to say on the Problem 
of Noise (Cambridge, 1934, p. 54): 

“There is one interesting reason which tends to tie up noise 
with nervous complaints. Very intense sounds, sounds of high 
pitch, and sounds whose source cannot readily be found or seen 
are all primary fear stimuli. In their presence most young 
children shrink and show the behavior characteristic of fear or 
timidity, and at some time or other, when sounds possessing 
these characters have occurred, most adults must have felt the 
unmistakable thrill of fright. Fear, of all human reactions, is
probably the one most likely to get tied up with nervous disorders.”

From this it follows that even when, as frequently happens, 
able or highly motivated pupils maintain approximately the 
same level of achievement under noisy or ‘silent’ conditions, they 
succeed in doing so only by virtue of putting forth additional 
effort to overcome the new obstacles. Most of us usually work 
far below our emergency possibilities—and it is well that this is so, 
for if we labored at peak capacity all the time any fresh and 
unanticipated demand on our reserves would cause instant 
collapse. 

Hence, every reasonable and economically justified effort to 
reduce noises in the school plant appears to further the aims of the 
educational process. New building construction during the 
postwar era should place this consideration high on its list of 
priorities if teachers and pupils are to be spared the needless 
inefficiency of trying to go uphill with the brakes on. 








CORRELATES OF HANDEDNESS 
AMONG COLLEGE FRESHMEN 


J. R. WITTENBORN
Yale University 


THE CEREBRAL DOMINANCE THEORY 


With respect to certain functions at least, one of the cerebral 
hemispheres in man assumes a dominance or a prepotence over 
the other. This fact has been employed in speculations regarding 
the origin and nature of certain anomalies in the development 
of language function. Among the psychological and educa- 
tional publications may be found numerous assumptions (albeit 
conflicting evidence) for relationships between language facility 
and cerebral dominance or its external manifestations, e.g., 
laterality. This literature comprises a group of loosely defined 
assumptions which is frequently termed a theory. A well known 
discussion of this theory* and a statement of its significance for 
reading, writing, and speech problems among children has been 
contributed by Orton. 

Employing both neurological and evolutionary arguments, 
Orton propounds an intimate, interdependent relation between 
handedness and facility in the language function. In conse- 
quence it is claimed by him that optimal development of lan- 
guage faculties may occur when the preferred hand is contralateral 
with (opposite from) the dominant hemisphere. It is implied 
by him that maximal cerebral dominance is unlikely in the case 
of confused manual preference, and that optimal language 
development is not only improbable, but that problems in 
language skills may occur. Specifically, the following hypo- 
thesis is deduced from Orton’s discussion and is put to test in 
the present investigation: 


Hypothesis I. Confusion in manual preference has as 
a probable consequence disturbance or sub-optimal 
development of language function. 





* Orton’s scholarly discussion does not state a theory in any formal sense, 
and his position regarding many of the specific relations between handedness 
and language facility is not unambiguously stated by him in Reading, Writing 
and Speech Problems among Children. 

* The first hypothesis of the present investigation, which states a specific 
consequence of Orton’s general discussions, is the writer’s deduction and 
may not be subscribed to by Orton.

If this hypothesis is valid for college students, a knowledge of
the individual student’s handedness could be a very useful 
adjunct in diagnosing difficulties in reading and other language 
disabilities. It was this attractive possibility which led the 
writer to undertake the present investigation. 

In the investigation, Hypothesis I is tested by forming four 
groups of students who, within the limits of the inquiry, are: 


RR) consistently right handed 

RL) usually inclined to prefer the right hand but occa- 
sionally prefer the left 

LR) usually inclined to prefer the left hand but occa- 
sionally prefer the right 

LL) consistently left handed 


It is assumed, if the hypothesis is to be sustained in this 
investigation, that groups RL) and LR) will be inferior in cer- 
tain aspects of language facility to groups RR) and LL). 


AN ALTERNATIVE HYPOTHESIS 


It is not unusual to encounter reading-handicap cases which 
reveal strikingly the classical pattern of specific language dis- 
ability: confused (mixed) manual preference, a history of speech 
defects, difficult spelling, slow reading, the general pattern of 
strephosymbolia, etc. Nevertheless, as Orton has noted, obser- 
vation reveals an impressive number of people whose manual 
preference is not well defined but for whom no evidence of lan- 
guage disability is obtainable. In other respects, however, 
handedness may have a consistent significance. Both a priori 
consideration and observation suggest that left-handed indi- 
viduals are handicapped in a right-handed society. Therefore, 
in contrast with Orton’s general argument, an alternative hypothesis
concerning the general scholastic significance of handedness
is proposed and examined in the present study:


Hypothesis II. Individuals who prefer the left hand 
will be handicapped in performances which particularly 
reflect the habits of a right-handed society. 


Within the limits of the investigation, it is assumed that
Hypothesis II will be refuted if performances which obviously
involve manual facility (e.g., writing, ciphering) do not reveal
a tendency for left-handedness to be associated with a degree
of skill inferior to that associated with right-handedness.


(Please answer these questions as accurately as possible. Your
response will help us in evaluating your reading and studying
skills.)

1) At what age were you, in comparison with your contemporaries,
least skillful in reading? ______. Check your standing at that time:

        1       2       3       4       5       6       7
    very poor                average                very good

2) At what age were you, in comparison with your contemporaries,
least skillful in spelling? ______. Check your standing at that time:

        1       2       3       4       5       6       7
    poorest in               average                 best in
      class                                           class

3) At what age were you, in comparison with your contemporaries,
least skillful in writing (good writing is fast, legible and
attractive)? ______. Check your standing at that time:

        1       2       3       4       5       6       7
    poorest in                about                  best in
      class                  average                  class

4) At what age did you, in comparison with your contemporaries,
find speech most difficult? ______. Check your standing at that time:

        1       2       3       4       5       6       7
    stammered                 about                very facile
      badly                  average                 speech

5) With which hand do you throw more easily?
6) With which hand do you deal cards more easily?
7) Which hand is stronger?
8) Which is your favored (more useful) hand?
   Has this always been your favored hand?
9) With which hand do you write more easily?

Fig. 1—Questionnaire.


THE DATA AND THEIR ANALYSIS 


Responses to the accompanying questionnaire were employed 
for the purpose of defining the handedness groups and of securing 
a self-estimate of certain aspects of language facility. 

Two factors which determined the use of a handedness ques- 
tionnaire in preference to a more objective laboratory examina- 
tion were: convenience and a desire to test the alternative 
hypotheses with the type of device which would make applications 
on a large scale feasible. The questionnaire was administered 
by the writer as a part of the general reading tests which are 
regularly given to all members of the entering freshman class at 
Yale.

As a first step in the analysis, an entire freshman class at Yale
was divided into four groups on the basis of questionnaire
responses:


(RR) those who had consistently used the right hand 

(RL) those who preferred the right hand but used the 
left in one or more of the listed activities; and those 
who at some previous time had preferred the left 
hand, but now used the right 

(LR) those who preferred the left hand but used the 
right hand in one or more of the listed activities; and 
those who at some previous time had preferred the 
right hand but now preferred the left 

(LL) those who had consistently used the left hand 
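
The grouping rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the key names (`throw`, `deal_cards`, `stronger`, `write`, `favored`) and the function `classify_handedness` are hypothetical labels for questionnaire items 5 to 9, not part of the original procedure, and the sketch assumes every item is answered with a plain "R" or "L".

```python
from typing import Mapping

# Hypothetical keys for questionnaire items 5, 6, 7, and 9; item 8 is "favored".
ACTIVITY_ITEMS = ("throw", "deal_cards", "stronger", "write")


def classify_handedness(answers: Mapping[str, str], always_favored: bool) -> str:
    """Return 'RR', 'RL', 'LR', or 'LL' for one student.

    answers maps each item key (plus 'favored') to 'R' or 'L';
    always_favored is the reply to "Has this always been your favored hand?".
    """
    favored = answers["favored"]
    if favored not in ("R", "L"):
        raise ValueError("favored hand must be 'R' or 'L'")
    other = "L" if favored == "R" else "R"
    uses_other = any(answers[item] == other for item in ACTIVITY_ITEMS)
    # Mixed group: some listed activity is done with the other hand,
    # or the preferred hand was different at some previous time.
    if uses_other or not always_favored:
        return favored + other
    return favored + favored


student = {"throw": "R", "deal_cards": "L", "stronger": "R",
           "write": "R", "favored": "R"}
print(classify_handedness(student, always_favored=True))  # RL
```

A student who could not give an unqualified answer to some item would have to be left out of the corresponding subgroup, which the sketch does not attempt to model.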


In testing Hypotheses I and II, these four groups were com- 
pared with each other in the following respects: 


A) Self-ratings:
     1. Reading
     2. Writing
     3. Speech

B) Test Scores:
     1. Reading Rate
     2. Reading Comprehension
     3. Scholastic Aptitude Test (Verbal), C. E. E. B.
     4. English Essay, C. E. E. B.
     5. Verbal Reasoning, Yale freshman Aptitude Battery
     6. Mathematical Aptitude Test, C. E. E. B.
     7. Quantitative Reasoning, Yale freshman Aptitude Battery
     8. Spatial Visualization, Yale freshman Aptitude Battery

In addition, for the particular purpose of testing Hypothesis
II, groups formed on the basis of response to questionnaire items
5, 6, 7, 8, and 9 were examined with respect to the above ratings
and tests.


TABLE I.—RELATION BETWEEN SELF-RATINGS AND HANDEDNESS HABITS

                                     Mean of self-ratings*
                          N     Reading  Spelling  Writing  Speech
Handedness Groups:
  RR                     219      4.2      4.0       3.4      4.3
  RL                     111      4.1      4.5       3.5      4.3
  LR                      18      4.2      4.5       2.9      4.5
  LL                      20      3.1      4.7       3.4      4.3
Activity Preference: [see questionnaire]
  Throw R                360      4.1      4.4       3.4      4.3
  Throw L                 47      3.9      4.7       3.2      4.5
  Cards R                315      4.2      4.4       3.4      4.3
  Cards L                 84      3.7      4.4       3.3      4.3
  Stronger R             330      4.1      4.4       3.5      4.3
  Stronger L              65      4.0      4.6       3.2      4.3
  Prefer R               364      4.1      4.7       3.7      4.3
  Prefer L                44      3.8      4.5       3.2      4.4
  Write R                370      4.1      4.4       3.4      4.4
  Write L                 40      3.6      4.        3.       4.4

* This analysis was made for all of the members of the class of 47J. Due
to the fact that an occasional student was unable to give an unqualified
answer to some of the questionnaire items, the frequency of subgroups in
Tables I and II may not be combined to give a fixed total.


The relevance of handedness to self-ratings for reading, spelling,
writing and speech is summarized in Table I. From the
table it may be seen that individuals who prefer the left hand in
any of the activities listed (questionnaire items 5 to 9) also tend
to rate themselves as the poorer readers.* It appears from
further examination of the four handedness groups that this
tendency is due to those individuals who employ the left hand in
all of the activities. Why reading self-ratings are lowest for the
uniformly left-handed is not immediately apparent. It is possible
that those handicapped by an almost exclusive use of the left
hand may be slow and cautious in all scholastic activities and as
a consequence read slowly; slowness is almost universally regarded
as a reading handicap.

* None of the handedness groups was significantly inferior on the tests of
reading rate and comprehension.

Table I also reveals a consistent tendency for those who report 
the use of the left hand in various activities to give their hand- 
writing a lower rating than those who use the right hand in the 
respective activities. The data of Table I reveal no tendency
for self-rated disabilities in speech and spelling to be related to
mixed-handedness. Such a tendency is required, however, by
Hypothesis I. It is apparent in general that the self-rating data
challenge the validity of Hypothesis I. In only one instance
does a mixed-handedness group (LR, in the case of writing) rate
itself lower than the uniform groups (RR and LL). In two
instances a uniform group rates itself lower than the mixed
groups: RR in the case of spelling, and LL in the case of reading.

As the next step in the analysis, the relationships between handedness
and scores on the eight tests listed above were scrutinized.
For the reading tests, the Scholastic Aptitude Test (Verbal), and the
Yale Verbal Reasoning test, no phase of the analysis revealed either
a difference approaching statistical significance or a discernible
trend. For the remaining tests, however, trends were revealed.

The data of Table II, like those of Table I, offer no support for
the cerebral dominance hypothesis and suggest that the significance
of handedness in scholastic skills is a consequence of the
direct handicap of left-handedness in a right-handed society.
If the deduction from Orton’s discussion were to be supported,
one should expect to find the people with mixed handedness
(groups RL and LR) to be consistently lower than the primarily
left-handed (LL) or right-handed (RR) groups on the English Essay,
as well as on the Scholastic Aptitude Test (Verbal), the reading
tests, and the verbal reasoning test. The data do not reveal such
differences.

The English Essay examination is written by hand and scored
somewhat subjectively by readers. It is likely that on such an
examination a person who writes laboriously or illegibly would
be penalized. It is of interest to note that those who prefer the
left hand but use the right hand in some activities (LR) not only
form the handedness group which does most poorly on the English
Essay, but are also the students who gave themselves
the lowest rating on handwriting. Why this particular group,
and not the people who use their left hands consistently, has
the lowest self-rating in handwriting is not answerable with the
data at hand. It is noteworthy that for each activity the students
who favor the left hand are the ones who have the lower
scores on the English Essay test.


TABLE II.—RELATIONSHIP BETWEEN TEST SCORES AND HANDEDNESS

                                       Mean of test scores
                                          Quanti-    Mathe-
                                English   tative     matical    Spatial
                                Essay¹    Reason-    Ingenu-    Visual-
                          N*              ing        ity²       ization
Handedness Group:
  RR                     324    56.5      54.3       55.5       55.2
  RL                     137    54.3      55.5       55.0       55.3
  LR                      34    52.5      54.0       53.5       57.4
  LL                      28    56.0      54.1       51.5       55.0
Activity Preference:
  Throw R                360    55.5      55.4       55.5**     55.1
  Throw L                 47    53.4      53.6       51.6       53.8
  Cards R                315    55.3      55.4       55.6       55.4
  Cards L                 84    54.9      54.9       53.7       53.7
  Stronger R             330    55.9**    55.3**     55.4**     55.4**
  Stronger L              65    52.8      52.8       52.5       52.4
  Prefer R               364    55.7**    55.3       55.5**     55.5
  Prefer L                44    53.4      54.1       52.5       53.9
  Write R                370    55.5      55.3       55.4**     54.9
  Write L                 40    53.5      54.8       52.5       55.4

* Since the mental test data for the handedness groups were considered
especially critical for the hypotheses, the trend observed for the class of 47J
was confirmed by collecting the appropriate data for the small classes of
47M and 47N; the same pattern for all classes was observed and the combined
scores are given.
¹ The difference between RR and RL and between RR and LR is significant at
the five-per-cent level.
² The difference between RR and LR, RR and LL, and RL and LL is
significant at the five-per-cent level.
** Significant at the five-per-cent level.

Numerical ability, whether expressed through a measure of
quantitative reasoning or, more appropriately, a test of numerical
facility (mathematical ingenuity), may be expected to bear a
relation to handedness. Ciphering with the left hand, especially
at the blackboard, is almost invariably a handicap. Conceivably,
therefore, a person who is left-handed is handicapped in
numerical work, especially the highly speeded variety such as
that called for by the mathematical ingenuity test. The data of
Table II for the mathematical tests suggest that left-handedness
is a handicap for numerical work and are in agreement with
Hypothesis II.


DISCUSSION 


Although the analysis for the four handedness groups failed
to verify Hypothesis I, Hypothesis II was not invalidated.
Nevertheless, the status of Hypothesis II as determined by this
part of the analysis is not altogether clear. From the hypothesis
it may be predicted that writing self-ratings and English
Essay scores for the (LL) group would be inferior to those for the
(RR) group, but no significant difference was found between these
groups. The group most handicapped in writing and the English
Essay actually preferred the left hand (LR) but was not
the uniformly left-handed group (LL).

It must be noted, nevertheless, that the (RL) group as well as
the (LR) group is inferior to the (RR) group. This trend is
called for by Hypothesis I, but the hypothesis also requires that
(RL) and (LR) be inferior to (LL). Such is not the case, however.
The data for the English Essay test may not be considered
to favor one hypothesis more than the other.

Students who threw, dealt cards, were stronger, generally preferred,
or wrote with the left hand were consistently inferior to
the others in performance on the English Essay, quantitative
reasoning, mathematical ingenuity, and spatial visualization tests.
These students, moreover, gave themselves inferior ratings in
reading and writing. Some of the differences between those
who favored the right hand and those who favored the left were
statistically significant; the consistency with which those favoring
the right hand excelled those favoring the left is remarkable.
Clearly, in the above overlapping activity group samples, a
tendency to use the left hand has as a consequence a slight but
nevertheless consistent handicap. This handicap is most reliable
for speeded numerical work, i.e., the mathematical ingenuity
test.

In the literature may be found numerous accounts of careful
investigations which have failed to yield evidence for a cerebral
dominance theory of handicaps in reading and speech. An
examination of some of these accounts yields evidence which
supports the writer’s Hypothesis II. For example, incident to
a study of peripheral vision in school children, LaGrone and
Holland¹ found that preference for the left hand is associated with
low scores on the Otis Quick Scoring Test for Mental Ability
and the Gates Primary Reading Test.

The effect of laterality on elementary-school stutterers was
the subject of a lengthy investigation by Spadina;⁴ it was concluded
that none of the evidence could be accepted as establishing
an association between aspects of laterality and stuttering:
“Although the differences were not significant the following
trends were found:

“1) a few more stutterers than non-stutterers were left-handed,
left-eyed, left-footed, left-sided.

“2) a few more non-stutterers than stutterers were right-handed,
right-eyed, right-footed, right-sided.”

A third study suggesting a slight handicap for those who
favored the left hand has been published by Leavell and Fults.²
Those two investigators concluded that:

“These results reflect that left dominance (lateral, not cerebral)
is less favorable to the acquisition of reading skill than right
dominance and that conflict between eye and hand dominance
is less favorable (to reading) than is complete left dominance.
Impartiality of dominance reflects very little upon reading
achievement.”*


CONCLUSIONS 

As a result of the present study and other published studies,
ambidexterity and confused or undetermined handedness appear
to have a negligible if not nonexistent significance for language
facility.

A tendency to favor the left hand, however, may result in a
slight handicap for both school children and college freshmen.
This handicap is most apparent for the mathematical ingenuity
test. The handicap is conceivably the result of
two factors: mechanical inconvenience and emotional consequences.
The difference is small, however, and should not be
taken as evidence for justifying deliberate attempts to change
the manual preference of children.

* The parenthetical insertions are the writer’s.



THE VALIDITY OF CERTAIN OBJECTIVE
TECHNIQUES FOR MEASURING THE
ABILITY TO TRANSLATE
GERMAN INTO ENGLISH*

HENRY S. DYER
Harvard University


Teachers of foreign languages sometimes express distrust of 
objective test methods on the ground that they do not provide the 
student with an adequate opportunity to demonstrate his ability 
to translate from one language into another. The objective test, 
they say, may furnish a reliable measure of the knowledge of 
isolated words, grammatical forms, and idioms, but the possession 
of such knowledge, though necessary, is not a sufficient guarantee 
that the student can apply it in producing a good translation. 
This sort of criticism, which relies on the internal logic of the
situation, seems entirely reasonable. Although the present study
is based on a small number of cases and is, therefore, only explora- 
tory in character, its purpose is to discover whether the conten- 
tion of the language teachers has any basis in fact. 

Twenty-one college women, who had been studying beginning 
German for approximately five months, were given an objective 
test containing items of the recall type. The test consisted of four 
parts as follows: 


Part I. Fifty German words for each of which the student
    was asked to give one English equivalent. (Fifty scorable
    responses)

Part II. Fifty English words for which the student was
    asked to give one German equivalent. Twenty of these
    words were nouns. For each noun the student was asked
    to give not only the German equivalent, but also the article
    for the nominative singular and the noun in the nominative
    plural. (Ninety scorable responses)

Part III. Ten German verbs for which the student was asked
    to give the four principal parts (in German) and one English
    meaning. (Fifty scorable responses)

Part IV. Twenty-five sentences to be translated from German
    to English. There were in each sentence three
    crucial grammatical forms or idioms which alone were
    scored. (Seventy-five scorable responses)

* The writer is indebted to Dr. George K. Zipf for furnishing the test
materials on which this study is based; to his assistants for grading the
translation passages; and to Professor T. L. Kelley for advice on the handling
of the data.


The test therefore provided for a total of two hundred sixty-five 
possible responses. The students were given one hour to com- 
plete it. 

The reliabilities of the four parts and of the test as a whole were
estimated from the correlations between the split-halves by means
of the Spearman-Brown Prophecy formula. Table 1 shows the
reliability data for the test.
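
As a minimal sketch of the split-half procedure just described, the Spearman-Brown step can be written as a single function. The half-test correlation used below is a hypothetical value chosen only for illustration, and the function name `spearman_brown` is likewise an assumption of this sketch.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Reliability of a test lengthened k times, given reliability r of one unit."""
    denominator = 1.0 + (k - 1.0) * r
    if k <= 0 or denominator <= 0:
        raise ValueError("k must be positive and 1 + (k - 1) * r must be positive")
    return k * r / denominator


half_test_r = 0.80  # hypothetical correlation between the two halves of a test
print(f"{spearman_brown(half_test_r):.3f}")  # 0.889
```

With k = 2 the function gives the reliability of the full-length test from the correlation between its halves; other values of k give the general prophecy formula.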

In setting up a criterion by means of which the validity of
the test as a measure of translation ability could be determined,
an attempt was made to meet the following requirements:

(1) The validity of the criterion itself should be self-evident,
i.e., it should be based upon competent evaluation of students’
actual translations.

(2) The criterion should be a ‘pure’ measure, i.e., irrelevant
factors should not be permitted to influence the evaluations.

(3) An unbiased estimate of the reliability of the judges
evaluating the translations should be available.

(4) An unbiased estimate of the reliability of the samples
of translation evaluated should also be available.


TABLE 1.—RELIABILITY OF THE OBJECTIVE TEST
(N = 21)

                Number of
                responses     $r_{11}$
Part I              50          .71
Part II             90          .89
Part III            50          .90
Part IV             75          .77
Total              265          .94


Two weeks after the administration of the objective test, the
students were given five short passages of German to be translated
into English. The passages ranged in length from sixty-six
to one hundred two words. An effort was made to provide
passages of approximately equal difficulty in terms of the knowledge
of words and grammatical structure needed for their
translation. In none of the passages was there any word or
grammatical form to which the student had not been exposed by
the time he had taken the objective test. Hence, the passages to
be translated and the objective test can be presumed to sample the
same population of words and grammatical forms.

The translations were written during an ordinary class period 
and required thirty-five to forty minutes for their completion. 
Immediately thereafter each of the five passages translated by 
each of the twenty-one students was coded for identification and 
typed in triplicate on separate slips of paper. These slips were 
then given with a set of instructions to each of three judges, two 
of whom were instructors in the course. The judges were told (a) 
to arrange each set of translation passages in order of merit, (b) to 
grade each set of translations on a scale from 0 to 100, (c) to com- 
plete the ranking and grading of the translations of Passage 1 
before going on to those of Passage 2, etc., (d) to avoid any dis- 
cussion of the work until the slips of all the judges had been 
returned. It is believed that the foregoing precautions effec- 
tively eliminated any possibility of halo or hearsay effect in the 
grading. In the appendix the full instructions for grading the 
translations are given. All three judges were experienced teach- 
ers of college German. 

The reliability of the judges was estimated by correlating
their ratings on each passage and also the sums of their ratings on
each student. In each instance, the three correlation coefficients
were converted to Fisher’s z-value,* the z-values were averaged,
and the mean z-value was reconverted to $\bar{r}$; $\bar{r}$ is the
reliability of the average judge. To secure an estimate of the
reliability of their combined judgments, the $\bar{r}$’s were stepped
up by means of the Spearman-Brown Prophecy formula,

$$r_{33} = \frac{3\bar{r}}{1 + 2\bar{r}} \qquad (1)$$

These data are presented in Table 2.
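
The averaging and stepping-up steps can be illustrated with a short sketch. The three pairwise correlations below are hypothetical, since the individual coefficients are not reported; the average-judge values are those printed in Table 2. The function names are assumptions of this sketch.

```python
import math
from typing import Sequence


def mean_r_via_fisher_z(rs: Sequence[float]) -> float:
    """Average correlations by converting to Fisher's z, averaging, and reconverting."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


def step_up(r_bar: float, k: int) -> float:
    """Spearman-Brown reliability of k combined judges, given the average judge."""
    return k * r_bar / (1.0 + (k - 1) * r_bar)


# Hypothetical correlations between the three pairs of judges on one passage.
pairwise = [0.85, 0.90, 0.91]
r_bar = mean_r_via_fisher_z(pairwise)
print(f"average judge: {r_bar:.2f}, combined: {step_up(r_bar, 3):.2f}")

# Average-judge reliabilities as printed in Table 2, stepped up for three judges.
table_2 = {"Passage 1": 0.89, "Passage 2": 0.71, "Passage 3": 0.70,
           "Passage 4": 0.84, "Passage 5": 0.84, "All five passages": 0.94}
for label, value in table_2.items():
    print(f"{label}: {step_up(value, 3):.3f}")
```

Stepping up the printed average-judge values in this way gives figures that agree with the combined-judgment column of Table 2 to within rounding.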

Although there is some variability in the amount of agreement
existing among the judges on the individual translation passages,
the reliability of the average judge on the sum of his ratings for
each student is surprisingly high. The question naturally arises
whether the $\bar{r}$’s are spurious because of some uncontrolled factor.
Although the method used insured absolute independence in the
actual evaluations, the three instructors who judged the translations
had been associated in the teaching of elementary German
for a number of years. In the course of this association their
standards for grading translations of this kind should have become
to a large extent similar. The reliability therefore of their combined
judgments on all five passages (.98), which theoretically is
an estimate of the degree of correlation that would exist between
these three judges and another three of similar competence,
should not be regarded as having any universal significance.
The most that can be said is that if the present group of judges
were paired with another group of similar background and experience,
the correlation between the judgments of the two groups
would probably be of a high order. This much is sufficient to the
purposes of the study. It means that on the average the criterion
has high face validity in the judgment of these German
instructors and their colleagues.

* Fisher, R. A. Statistical methods for research workers. London, 1938.
pp. 202ff.



TABLE 2.—ESTIMATED RELIABILITY OF THE JUDGES

                        Reliability of the     Reliability of
                          average judge      combined judgments
                           ($\bar{r}$)           ($r_{33}$)
Passage 1                      .89                  .96
Passage 2                      .71                  .88
Passage 3                      .70                  .88
Passage 4                      .84                  .94
Passage 5                      .84                  .94
All five passages              .94                  .98

The reliability of the judges should not be confused with the
reliability of the translations as samples of student behavior; it
merely sets an upper limit on any estimate of the latter. The
sampling reliability was estimated by securing the intercorrelations
among the combined grades of the five passages, converting
these correlations to Fisher’s z-value, computing the mean of the
z’s, and reconverting to obtain the average $\bar{r}$. $\bar{r}$ was then
stepped up by the Spearman-Brown Prophecy formula. The
intercorrelations and the sampling reliability are given in Table 3.
Because the material was presented to each judge in such a way
that it was not possible for him to surmise which passages may
have belonged in the series translated by a given student, the
intercorrelations among the passages cannot be considered as
spuriously large because of the operation of any halo effect. The
reliability found can therefore be regarded as an accurate statement
of the situation.
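
The sampling-reliability computation described above can be retraced from the ten intercorrelations printed in Table 3. The sketch below is an illustration only: the stepped-up figure it prints is computed here from those coefficients and is not quoted from the table.

```python
import math

# The ten intercorrelations among the five passages, as printed in Table 3.
intercorrelations = [0.54, 0.67, 0.54, 0.63,   # I with II, III, IV, V
                     0.16, 0.39, 0.50,         # II with III, IV, V
                     0.67, 0.51,               # III with IV, V
                     0.64]                     # IV with V

# Convert to Fisher's z, average, and reconvert to obtain the average r.
mean_z = sum(math.atanh(r) for r in intercorrelations) / len(intercorrelations)
r_bar = math.tanh(mean_z)

# Step up by the Spearman-Brown formula for a criterion of five passages.
n_passages = 5
sampling_reliability = n_passages * r_bar / (1.0 + (n_passages - 1) * r_bar)

print(f"{r_bar:.2f}")                 # 0.54
print(f"{sampling_reliability:.2f}")  # 0.85
```

The average of .54 agrees with the value printed beneath Table 3.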


TABLE 3.—INTERCORRELATIONS AMONG THE COMBINED GRADES
ON THE TRANSLATION PASSAGES

            I       II      III     IV      V
I                   .54     .67     .54     .63
II                          .16     .39     .50
III                                 .67     .51
IV                                          .64
M         81.0    86.0    82.7    78.0    74.5
σ          8.7    10.1    12.2    16.0    11.5

$\bar{r}$ = .54


TABLE 4.—CORRELATIONS BETWEEN THE TOTAL AND SEVERAL
PARTS OF THE OBJECTIVE TEST AND THE COMBINED RATINGS
ON THE TRANSLATION PASSAGES

Objective Test   Nature of items                             r      z    $r_{\infty\omega}$*
Part I           English equivalents for German words       .32    .33    .41
Part II          German equivalents for English words       .70    .87    .80
Part III         German verb forms                          .29    .30    .33
Part IV          Translation of short sentences             .58    .67    .72
Total test                                                  .64    .75    .71

Mean of z-value Parts I-IV = .54

* $r_{\infty\omega}$ is the r corrected for attenuation resulting from the
unreliability in both variables. It was computed by the formula:

$$r_{\infty\omega} = \frac{r_{12}}{\sqrt{r_{11}}\,\sqrt{r_{22}}}$$

It remains to determine the relationship existing between the 
criterion and the objective test. Table 4 gives the pertinent 
correlations. 
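
The correction for attenuation used in Table 4 can be illustrated as follows. The test reliabilities are those of Table 1 and the raw correlations those of Table 4; the criterion reliability is an assumption of this sketch, obtained by stepping up the average intercorrelation of .54 for five passages.

```python
import math


def correct_for_attenuation(r12: float, r11: float, r22: float) -> float:
    """Correlation corrected for unreliability in both variables."""
    if not (0.0 < r11 <= 1.0 and 0.0 < r22 <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r12 / math.sqrt(r11 * r22)


# Assumed criterion reliability: .54 stepped up for five passages (about .85).
criterion_reliability = 5 * 0.54 / (1 + 4 * 0.54)

# (raw correlation with the criterion, reliability of the part) per row.
rows = {"Part I": (0.32, 0.71), "Part II": (0.70, 0.89),
        "Part III": (0.29, 0.90), "Part IV": (0.58, 0.77),
        "Total test": (0.64, 0.94)}

for label, (r12, r11) in rows.items():
    corrected = correct_for_attenuation(r12, r11, criterion_reliability)
    print(f"{label}: {corrected:.3f}")
```

Under that assumption the corrected values agree with the last column of Table 4 to within rounding.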

176 The Journal of Educational Psychology

The correlations between the criterion and each of the four
part scores are of some interest. However, the differences among
them should not be taken too seriously. The standard deviation
of the z-values about the mean z-value of .54 is .239, which is
approximately the same degree of variability that one would
expect from chance factors alone. The latter is equivalent to
$\sigma_z$, which is .236. It is nevertheless intriguing to note that the
highest correlation occurs with Part II (giving German equivalents
for English words) and that next to the lowest occurs with
Part I (giving English equivalents for German words). It is
hoped that in a subsequent study the difference in validity between
these two objective techniques may be further investigated.
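
The comparison of the observed spread of the z-values with the spread expected by chance can be checked with a few lines. The sketch assumes that the chance value is the usual standard error of Fisher's z, 1/sqrt(N - 3), and it uses the z-values as rounded in Table 4, so the observed figure differs slightly from the .239 quoted in the text.

```python
import math
import statistics

# z-values for Parts I-IV as printed in Table 4.
z_values = [0.33, 0.87, 0.30, 0.67]
n_students = 21

# Standard deviation of the four z-values about their own mean.
observed_sd = statistics.pstdev(z_values)

# Standard error of Fisher's z for a sample of N cases (assumed formula).
chance_sd = 1.0 / math.sqrt(n_students - 3)

print(f"{observed_sd:.3f}")  # 0.238
print(f"{chance_sd:.3f}")    # 0.236
```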

However, the validity of the test as a whole may be reliably
estimated. The t-value equivalent to an r of .64 is 3.69. This is
well above the one per cent level of confidence. We can therefore
infer that the objective test as a whole has a significant relationship
with translation ability as measured by the criterion. It is,
on the other hand, of some importance to know whether the objective
test is significantly different from the criterion. The null
hypothesis in this instance would be that the correlation (corrected
for attenuation) between the objective test and the criterion
does not differ significantly from unity. The writer does not
know of any exact method by which this hypothesis can be tested.
However, for $r_{\infty\omega}$ of the order .71, the assumption that errors of
sampling are normally distributed may be taken to be a close
approximation of the true situation. Using Kelley’s formula*
for the standard error of $r_{\infty\omega}$,

[Formula (2): Kelley’s expression for $\sigma_{r_{\infty\omega}}$, the standard error of $r_{\infty\omega}$]

we find that $\sigma_{r_{\infty\omega}}$ = .132. Consequently, the $r_{\infty\omega}$ of .71 differs
from 1.00 by an amount which is 2.20 times its standard error, a
difference that would be expected to occur by chance slightly
less than fourteen times in one thousand samplings. The null
hypothesis is therefore untenable. Consequently, it appears
likely that when the language teachers contend that objective
measures of the kind studied do not account for all of the functions
used in translation, they probably have a valid argument.
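
The two significance checks in the preceding paragraph can be retraced with standard formulas. This sketch assumes the usual t-ratio for a correlation with N - 2 degrees of freedom and a one-tailed normal probability for the critical ratio; it takes the standard error of .132 as given rather than recomputing it from Kelley's formula. From the rounded r of .64 it gives a t of about 3.63, close to the 3.69 reported, which may have been computed from an unrounded coefficient.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t-ratio for testing a correlation r from n cases against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def upper_tail_normal(z: float) -> float:
    """Probability that a standard normal deviate exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


t_value = t_from_r(0.64, 21)
critical_ratio = (1.00 - 0.71) / 0.132   # distance from unity in standard errors
p_one_tailed = upper_tail_normal(critical_ratio)

print(f"{t_value:.2f}")         # 3.63
print(f"{critical_ratio:.2f}")  # 2.20
print(f"{p_one_tailed:.3f}")    # 0.014
```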
* Kelley, T. L. Statistical method. Macmillan, New York, 1924. p. 209.



INSTRUCTIONS FOR GRADING THE TRANSLATION PASSAGES IN
GERMAN


In order to accomplish the purpose of this experiment, it is 
important that the grading of the papers shall be done without 
any possibility of a bias which might arise from the following: 

(a) an instructor’s opinion of the student’s work in general, 
(b) an instructor’s opinion based upon another instructor’s 
opinion of the student’s work 
(c) the placing of a disproportionate weight on a student’s 
answers to only one or two of the five questions. 
Therefore, the grading of the papers has been arranged in such a 
way that the grader cannot know which student wrote the answers 
nor whether a single student wrote a given series of answers. In 
other words, each answer will be graded entirely on its own merits. 
It is hoped that the graders will make every effort to follow to the 
letter the instructions given below: 

1) You will find all of the answers to Question 1 typed on a set
of individual slips. The entire set is labeled “Answers to Question
1.” The answers to each of the remaining questions are
arranged in the same way.

2) Each slip bears the number of the question and a code number
which has been assigned to the writer of the questions.
Please do not try to compare the code numbers on the answers to
different questions. The codes have been purposely disguised,
but they are not unbreakable.

3) Before proceeding with the actual grading of the slips, sign 
your name in the space provided on the title slip of each of the 
five sets. Please do this first, so that there will be no possibility 
of forgetting to do it. 

4) When grading, you should complete the work on the answers 
to Question 1 before going on to the answers to Question 2. 
Complete the second set before going on to the third set, etc. 

5) Use a percentage system for grading each set of answers. 
That is, a completely worthless answer would be graded 0, a per- 
fect answer would be graded 100. 

6) Consult no one with respect to the standards for grading.
In order to get completely unambiguous results from the present
study it is of the first importance that the grade given each answer
shall be the result of a wholly independent judgment. Please
do not compare your grades with those of any other grader at the
present time.

7) In order to get the finest possible discrimination among 
answers, the following technique is suggested: 

(a) Arrange in order of merit the slips containing the answers 
to a given question. A good way to do this is to select 
the most nearly perfect answer and the worst answer and 
then work in toward the middle. 

(b) After the answers have been arranged in order of merit,
go through the set assigning grades to each of the answers 
in the series. If possible try to avoid giving any two 
answers the same grade. A difference of as little as one 
point may not be significant, but it will be helpful later 
on in handling the data. 








REGRESSION LINES FOR ESTIMATING
INTELLIGENCE QUOTIENTS AND AMERICAN
COUNCIL EXAMINATION SCORES

DAVID F. VOTAW


Southwest Texas State Teachers College 


The high school counselor is in need of a device by which he can 
utilize the intelligence quotient of a high-school student to 
estimate the score on the American Council Psychological 
Examination which the student will make subsequently when he 
reaches college. Conversely, the college counselor wishes fre- 
quently to estimate a college student’s IQ from the ACE score 
made by the student in college. 

Weber¹ has provided a partial solution to these problems,
but he has left out of account the fact that there is considerably 
less than a perfect positive correlation between the two variables. 
His bisector of the angle made by the two regression lines is used 
to convert from either variable to the other. The fallacy in this is 
identical to the one involved in the conclusion that a son will be 
the same height as his father or a father the same height as his son 
for the reason that a population of fathers and a population of 
sons have the same mean height and the same variability in height. 
Actually, because of regression, the son of a tall father tends to be 
a little shorter than the father but the father of a tall son tends to 
be a little shorter than the son. Both regression lines should be 
drawn and each used only for conversion in its appropriate 
direction. 

The writer offers here tentative results from a small sample of 
students. Although a larger sample was available, all of the 
testing of the seventy students of the sample used was done by 
the writer—both the Otis Group Intelligence Test when the 
subjects were in their first year of junior high school and the 
American Council Psychological Examination, six years later, 
when the subjects entered college. 

All of the students were from the San Marcos, Texas, High 
School and were admitted to the Southwest Texas State Teachers 
College at regular entrance dates between 1943 and 1945. The 


¹ Weber, Edmund G., “Equating High-school Intelligence Quotients with
College Aptitude Test Scores,” The Journal of Educational Psychology, 36, 
1945, pp. 443-446. 
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uniformity of these conditions may offset somewhat the smallness 
of the sample. The mean and standard deviation of the ACE 
scores for the sample do not differ significantly from those of the 
total population of the college. 

In the discussions which follow 


X refers to an Otis IQ 
Y refers to the companion ACE score. 


The seventy pairs of measures were tabulated in a scattergram 
from which the following results were computed: 


$M_x = 108.9 \qquad M_y = 92.4$
$\sigma_x = 10.0 \qquad \sigma_y = 19.8$
$r_{xy} = +.74$
$PE_r = \pm .04$


From these values the regression equations were then found to be

$Y = 1.47X - 67.2$

(This is the equation to use when the Otis IQ, X, is known
and the ACE score, Y, is to be estimated. The probable
error of a score estimated thus is ±9.0.)

and

$X = .374Y + 74.4$

(This is the equation to use when the ACE score, Y, is known
and the Otis IQ, X, is to be estimated. The probable error
of an IQ estimated thus is ±4.5.)
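
The arithmetic behind the two equations can be retraced from the summary statistics alone. The following is a minimal Python sketch, not the writer's own procedure. The function names are hypothetical. It assumes the conventional definitions: the slope of Y on X is $r\sigma_y/\sigma_x$, each line passes through the two means, and a probable error is 0.6745 times the corresponding standard error.

```python
import math

# Summary statistics reported for the seventy pairs of measures.
N = 70
MEAN_X, MEAN_Y = 108.9, 92.4   # Otis IQ, ACE score
SD_X, SD_Y = 10.0, 19.8
R_XY = 0.74


def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return ((slope, intercept) of Y on X, (slope, intercept) of X on Y)."""
    b_yx = r * sd_y / sd_x
    a_yx = mean_y - b_yx * mean_x
    b_xy = r * sd_x / sd_y
    a_xy = mean_x - b_xy * mean_y
    return (b_yx, a_yx), (b_xy, a_xy)


def probable_error_of_estimate(sd, r):
    """Probable error of a score estimated from the other variable."""
    return 0.6745 * sd * math.sqrt(1.0 - r * r)


def probable_error_of_r(r, n):
    """Probable error of the correlation coefficient itself."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


(b_yx, a_yx), (b_xy, a_xy) = regression_lines(MEAN_X, MEAN_Y, SD_X, SD_Y, R_XY)
print(f"Y = {b_yx:.2f}X + ({a_yx:.1f})")
print(f"X = {b_xy:.3f}Y + ({a_xy:.1f})")
print(f"PE of estimated ACE score: {probable_error_of_estimate(SD_Y, R_XY):.1f}")
print(f"PE of estimated IQ:        {probable_error_of_estimate(SD_X, R_XY):.1f}")
print(f"PE of r:                   {probable_error_of_r(R_XY, N):.2f}")
```

Rounded as in the text, these values agree with the printed coefficients and probable errors.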


For practical utility, a line graph may be drawn for each equa- 
tion as shown in Fig. 1. 

The chart is to be used as follows: 

1) When the Junior High School Otis IQ, X, is known and an
estimate of the ACE score, Y, is desired, enter the chart from the 
top and deflect the glance to the left from the Y line to the vertical 
scale at the left. For example, if a junior high-school student has 
an Otis IQ of 134, he may be expected to make an ACE score of 
about 130 when he enters college. 

2) When the College ACE score, Y, is known and an estimate 
of the Otis IQ, X, is desired, enter the chart from the left and 
deflect the glance upward from the X line to the horizontal scale 
at the top. For example, if a college freshman has an ACE score 
of 130, he may be expected to have on his high-school record an 
Otis IQ of about 123.
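
The two worked examples also show why each line may be used in one direction only: passing a score through both lines in turn does not return the starting value. A brief Python sketch using the printed coefficients follows. The function names are illustrative and are not part of the original article.

```python
def ace_from_iq(iq):
    """Estimate the ACE score, Y, from a known Otis IQ, X."""
    return 1.47 * iq - 67.2


def iq_from_ace(ace):
    """Estimate the Otis IQ, X, from a known ACE score, Y."""
    return 0.374 * ace + 74.4


iq = 134
ace = ace_from_iq(iq)        # close to the 130 read from the chart
iq_back = iq_from_ace(ace)   # close to 123, well below the original 134

print(f"IQ {iq} -> estimated ACE {ace:.0f} -> estimated IQ {iq_back:.0f}")
```

The estimate regresses toward the mean at each step, which is the reason a single bisecting line cannot serve for both conversions.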
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NOTE ON A COMMENT ON THE ‘CORRECTION’ 
OF RELIABILITY COEFFICIENTS FOR
RESTRICTION OF RANGE 


OCTAVIO A. L. MARTINS 


Departamento Regional do SENAI 
Rio de Janeiro, Brasil 


In the November 1944 issue of this JOURNAL Frederick B. Davis
published a note¹ on the prediction of reliability coefficients in
samples for which there is some restriction in the range of a cor-
related variable. The following formula was reached:


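$$R_{ab} = \frac{r_{ab} + r_{1a}^{2}\left(\dfrac{\Sigma_1^{2}}{\sigma_1^{2}} - 1\right)}{1 + r_{1a}^{2}\left(\dfrac{\Sigma_1^{2}}{\sigma_1^{2}} - 1\right)} \tag{1}$$
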
Subscripts a and b refer to equivalent halves of the test.
Subscript 1 to the correlated variable. Capital letters refer to 
statistics of the ‘unrestricted’ sample and lower case letters to 
statistics of the ‘restricted’ sample (or vice versa). Together 
with the Spearman-Brown prophecy formula this would give the 
reliability coefficient of the whole test. 

In a comment on that paper in the November 1945 issue of this 
JOURNAL, Hyman B. Kaitz² derived a formula giving directly the
reliability coefficient of the whole test. 


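$$R_{2\,\mathrm{II}} = \frac{r_{2\,\mathrm{II}} + r_{12}^{2}\,k}{1 + r_{12}^{2}\,k} \tag{2}$$

in which $k$ stands for $\left(\dfrac{\Sigma_1^{2}}{\sigma_1^{2}} - 1\right)$ and subscripts 2 and II refer to
comparable forms of the whole test. It can be seen that formulae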
(1) and (2) are identical but for the substitution of statistics of the 
comparable forms of the test for corresponding statistics of the 
equivalent halves of the same test. 

In both papers an interesting point was missed: there is no
need of the Spearman-Brown formula to get $R_{2\,\mathrm{II}}$ from (1), as
there is no need of any mathematics for the derivation of (2) from
(1). It is obvious that any test may be conceived as one half of a
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hypothetical test twice as long, and so formula (2) can be written 
immediately from formula (1) on this assumption. 
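
The equivalence can be checked numerically. The Python sketch below is an illustration only and rests on stated assumptions: formula (1) is read as $R_{ab} = (r_{ab} + r_{1a}^2 k)/(1 + r_{1a}^2 k)$ with $k = \Sigma_1^2/\sigma_1^2 - 1$, the halves are taken to have equal variances and equal correlations with variable 1, and the numerical values are hypothetical. Route A applies formula (1) to the halves and then steps up with the Spearman-Brown formula. Route B first derives the whole-test statistics in the restricted sample and then applies the same expression as formula (2).

```python
import math


def corrected_reliability(r_forms, r_with_1, variance_ratio):
    """Shared form of formulae (1) and (2): variance_ratio is Sigma_1^2 / sigma_1^2."""
    k = variance_ratio - 1.0
    return (r_forms + r_with_1 ** 2 * k) / (1.0 + r_with_1 ** 2 * k)


def spearman_brown_double(r_half):
    """Reliability of a test twice as long as the half with reliability r_half."""
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical restricted-sample statistics (not taken from either paper).
r_ab = 0.60            # correlation between equivalent halves
r_1a = 0.40            # correlation of each half with variable 1
variance_ratio = 1.8   # Sigma_1^2 / sigma_1^2

# Route A: formula (1) on the halves, then Spearman-Brown.
route_a = spearman_brown_double(corrected_reliability(r_ab, r_1a, variance_ratio))

# Route B: whole-test statistics first, then formula (2).
# For equal-variance halves, r between variable 1 and the sum a + b
# satisfies r_12^2 = 2 * r_1a^2 / (1 + r_ab).
r_whole = spearman_brown_double(r_ab)
r_12 = math.sqrt(2.0 * r_1a ** 2 / (1.0 + r_ab))
route_b = corrected_reliability(r_whole, r_12, variance_ratio)

print(f"Route A: {route_a:.5f}")
print(f"Route B: {route_b:.5f}")
```

Under these assumptions the two routes agree, which is consistent with the remark that formula (2) is formula (1) written for a test regarded as one half of a longer one.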
1) Davis, Frederick B. “A Note on Correcting Reliability Coeffi-
cients for Range.” Journal of Educational Psychology, November 1944,
Vol. XXXV, pp. 500-502.

2) Kaitz, Hyman B. “A Comment on the Correction of Reliability
Coefficients for Restriction of Range.” Journal of Educational Psy-
chology, November 1945, Vol. XXXVI, pp. 510-512.
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BOOK REVIEWS 


JULES H. MASSERMAN. Principles of Dynamic Psychiatry.
Philadelphia: W. B. Saunders Co., 1946, pp. 321. 


There are currently a large number of theoretical formulations 
which are designed to systematize the varied observations of 
behavior. All such formulations can be readily classified into 
three types. One, exemplified historically by Watsonian 
behaviorism, places primary emphasis on the physiological 
factors in behavior. A second, best shown in the various psy- 
choanalytic schools, is frankly dualistic with greater importance 
being attributed to mentalistic factors. The third, represented 
in psychiatry by Meyerian psychobiology, considers behavior as 
an expression of the whole organism and not as either physical 
or mental part-reaction. The author of the present volume 
considers each of these with intelligent and sympathetic under- 
standing, but shows that none has been adequate to deal with 
behavior as observed. 

Masserman, whose writings give evidence of wide experience
in the classroom, the clinic, the laboratory, and the library,
essays in this volume to formulate a “psychodynamic theory of
normal and abnormal behavior of explicit epistemologic, heuristic,
and operational validity.” It is this reviewer’s opinion that he
has been eminently successful.

The theory is most briefly stated in four principles: “(1)
Principle of Motivation: Behavior is basically actuated by the 
physiologic needs of the organism and is directed toward the 
satisfaction of those needs. (2) Principle of Experiential 
Interpretation and Adaptation: Behavior is contingent upon, 
and adaptive to, the organism’s ‘interpretations’ of its total 
milieu, as based on its capacities and previous experiences. (3) 
Principle of Deviation and Substitution: Behavior patterns 
become deviated and fragmented under stress, and when further 
frustrated, tend toward substitutive satisfactions. (4) Principle 
of Conflict: When in a given milieu two or more motivations 
come into conflict in the sense that their accustomed consum- 
matory patterns become incompatible, kinetic tension (anxiety) 
mounts and behavior becomes hesitant, vacillating, erratic, and 
poorly adaptive (neurotic) or excessively substitutive, symbolic 
and regressive (psychotic).”
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A reviewer cannot detail the sources and significance of these 
principles. However, the reader of the book is quickly aware 
that they rest upon a rich acquaintance with patients in the 
clinic, with experiments in the animal laboratory, and with the 
historical development of behavior theory. The author very 
modestly claims that the book has been written so that his stu- 
dents may have certain fundamentals in convenient form, and 
his time may be freed for the task of clinical teaching. Masser- 
man’s students can consider themselves fortunate indeed to be 
able to study with a clinician who has so sane and reasonable a 
theoretical point of view. This book is as important a contribu- 
tion to behavior theory as any which has yet appeared. Psycho- 
logists, and especially those concerned with clinical problems, 
cannot afford to miss it.

C. M. LOUTTIT

Ohio State University 


BERTHOLD LOWENFELD. Braille and Talking Book Reading: 
A Comparative Study. New York: American Foundation 
for the Blind, 1945, pp. 53. 


The purpose of this investigation is to compare the effective- 
ness of braille and talking book reading with respect to speed and 
comprehension and to note preferences of blind children for one 
or the other mode of reading. Children in third, fourth, sixth 
and seventh grades were tested. Standard test lessons in read- 
ing were used and in the upper grades story and textbook mate- 
rials were added. In addition to braille and talking book reading, 
the talking book material was presented with sound effects and 
with dramatizations. 

Blind children read about one-third as fast in braille as by 
talking book. The braille rate of reading was much slower 
(one half to one fourth) than norms for silent reading of seeing 
children. In the lower grades of the blind, comprehension by 
talking book reading with or without sound effects was superior 
to braille reading. These differences were greatest for children 
with low IQ’s. In the upper grades, comprehension for story
tests was the same in both types of reading. But for textbook 
materials, braille was superior to talking book reading. Most 
children preferred talking book reading with either dramatiza- 
tions or sound effects. 
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It is concluded that there should be improved instruction in 
braille with more emphasis upon speed and comprehension. The 
transition from uncontracted braille, which students learn first, 
to the second degree of contraction used in the upper grades 
should be introduced earlier and without the intermediate step. 
Recommendations are made for more extensive use of the talking 
book reading for supplementary reading at all levels. The use 
of talking book reading is particularly stressed for low intelligence 
children who profit little from braille instruction. 

Although this study has certain limitations in terms of the 
sample measured and test materials available for use, it was well 
planned and the data adequately analyzed. The conclusions 
are justified by the data and the implications are of high impor- 
tance for teachers of the blind. Recommendations with regard 
to braille reading and with regard to the use of talking book 
reading should receive sympathetic consideration. 

MILES A. TINKER

University of Minnesota 


THE OCCUPATIONAL OPPORTUNITY SERVICE, OHIO STATE UNI-
VERSITY. Ohio State and Occupations. Columbus, Ohio:
The Ohio State University Press, 1945, pp. 198.


The aim of this small volume is to furnish the college student 
with information concerning the range of job opportunities 
available to graduates majoring in the various college depart- 
ments. Certainly, the complex and highly specialized conditions 
of modern life indicate the need for intelligent considera- 
tion of vocational opportunities prior to graduation. All too 
frequently, students choose their college curriculum or major 
on the basis of inadequate, irrelevant, or erroneous information, 
and after having made their choice, do not have clearly in mind 
the occupational potentialities of such a decision. It is to 
acquaint the Ohio State University student with these vocational 
implications that the current volume has been prepared by the 
Occupational Opportunity Service, with the cooperation of the
University faculty, under the supervision of Harold A. Edgerton, 
Director of the Service. Such information should be useful 
not only to college students, but also to college advisors, uni- 
versity administrators, occupational counselors, and others 
engaged in planning student-training and vocational careers. 
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Occupations appropriate to majors in sixty-nine departments 
are listed, arranged alphabetically by departments and by jobs 
within each department. The number of jobs, or vocational 
opportunities, covered ranges from less than five, as in the case 
of the Petroleum Engineering, Public Relations, or Real Estate 
Departments, to sixty-eight in the case of the Education Depart- 
ment. In several instances, one or more paragraphs are initially 
devoted to such overall considerations as employment oppor- 
tunities, level of earnings, and general conditions of work. 
In the case of certain departmental majors having a large number 
of vocational opportunities, the grouping and inter-relationships 
between various job possibilities are presented in diagrammatic
form. The Table of Contents lists the departmental majors, 
and the job possibilities are presented alphabetically in the 
Index at the end of the book. 

The chief inadequacies of the present volume appear to lie 
in its omissions, inconsistent and uneven treatment, and, in a 
few instances, questionable information. In some cases (e.g. 
occupations appropriate to graduates of the Department of 
Botany), organization of material is quite atypical, the job 
descriptions being extremely brief and sketchy. Opportunities 
for work in the Federal Government are sometimes presented, 
as in occupations appropriate to majors in the Department of 
Chemistry, whereas this pertinent information is omitted in 
many of the other departments considered. The listing of 
several companies employing research pharmacologists is incom- 
plete and inconsistent with the treatment in the rest of the 
volume, and is likely to give the naive student the idea that 
these are the only companies employing research workers in this 
capacity. Similarly, information concerning the specific train- 
ing requirements and probable levels of earnings is omitted in a 
large number of cases. 

Readers of this JOURNAL will be particularly interested in the
listing appropriate to psychology majors. Here again the 
reviewer would raise a number of questions. The treatment 
of the activities of the Clinical Psychologist appears somewhat 
over-emphasized and out of proportion to the degree of emphasis 
placed on other psychological work. Despite the obvious 
fact that clinical work demands both well-rounded and advanced 
training, no required or desirable level of training is indicated 
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for work in this field, in contrast to psychological work in other 
areas. The level of training, ‘Master’s degree preferable,’ 
for the Industrial Psychologist would likewise appear to be some-
what of an understatement in the light of other indicated levels
of education. Similarly, the Master’s degree is held to be
required for the job Psychometrist who “administers a variety
of psychological tests under supervision,” whereas only one year
of specialized training beyond the A.B. is said to qualify one to
work as a Remedial Specialist, who “diagnoses and treats
persistent and severe scholastic difficulties at any level” and who
“must recognize and be able to deal effectively with the correlated
emotional and personality difficulties.” The student also learns
that the Personnel Psychologist (Educational) occasionally
‘acts as a private consultant,’ although no mention is made of
this activity on the part of the Industrial Psychologist or of the
Personnel Psychologist (Business or Industrial). Finally, there
is a need for a more careful editing of the printed manuscript.

Notwithstanding the above criticisms, the present volume
represents a useful source of information to students and counse-
lors alike. Its chief contribution, in comparison with such a
work as Williamson’s well-known Students and Occupations, is
that of greater coverage. In those areas where overlap does
occur, however, the Ohio State treatment is somewhat more
superficial.

JOHN P. FOLEY, JR.

Industrial Division, The Psychological Corporation, New
RUDOLF PINTNER AND ARTHUR I. GATES. The Value of
Individual Hearing Aids for Hard-of-Hearing Children.
Washington, D. C.: National Research Council, 1944, pp. 40.








Hearing aids improve personality adjustments of children as 
reported by parents. But there is no evidence that speech, 
achievement, or change in score on personality inventories are 
significantly influenced. These are the general conclusions of a 
study of fifty-two hard-of-hearing children attending regular 
public schools in Greater New York and in Jersey City who were 
provided with hearing aids and compared with a control group 
as far as possible. Observation was made over a period of 
twenty-six months. 
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This investigation was originally planned by Dr. Rudolf 
Pintner. After Dr. Pintner’s death the research project was 
continued under the directorship of Dr. A. I. Gates. 

H. MELTZER


Psychological Service Center, St. Louis, Missouri 


SISTER ROBERTINE WEIDEN, The Effect of Checked Directed Study 
Upon Achievement in Ninth-grade Algebra. The Johns 
Hopkins University Studies in Education, No. 34. Balti- 
more: Johns Hopkins Press, 1945, pp. 85. 


An attempt is made to determine the effect on achievement of 
administering daily check tests at the end of a directed study 
period in ninth-grade algebra. There were four matched groups 
of students: (1) The control group received no check test at end 
of period. (2) The posted group had scores of check test listed 
on the blackboard each day. (3) The annotated group had 
check tests returned each day with annotations on errors, correct 
solutions listed, etc. (4) The remedial group had corrected 
check tests returned each day and then devoted part of period to 
remedial work. At the end of twelve weeks the control group 
exchanged its procedure with the remedial group, and the posted 
group with the annotated group. Initial and final tests were 
given in each twelve-week period. Gains were also checked to 
determine whether any method was best adapted to high, average 
and low mental ability. 

When all checked groups were combined there was no signi- 
ficant gain in comparison with the control group for either 
twelve-week period. During the first twelve-week period, the 
only procedure which yielded superiority over the control group
was the annotating method. There were no significant differ- 
ences between the three checking procedures. In the second 
twelve-week experiment, with the methods interchanged, the 
remedial method proved superior to the control (non-checking) 
and to annotating of the test papers. The pupils preferred the 
check tests over no tests. No checking method was superior to 
another for any particular level of intelligence during the first 
twelve weeks, but the remedial procedure was found best for 
pupils of average ability during the last twelve weeks. This 
conclusion concerning the method best for a given level of intel- 
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ligence is of little value, since the groups compared were small 
(eight to twenty-two pupils). 

The shift in trends which occurred with interchange in methods 
raises a question concerning the adequacy of the experimental 
design. The author recognizes to some degree this difficulty. 
There is no way of knowing to what degree the work methods 
developed during the first twelve weeks carried over to the 
second period, or the degree of additional stimulation that comes 
from a mere change in method irrespective of what the new 
method is. For instance, the teachers reported that, when the 
control group during the first twelve-week period shifted to the 
remedial method in the second period, the pupils were much 
pleased to be checked daily and were stimulated to work much 
harder. It would seem that further research, such as suggested 
by the author, is needed to attain an unequivocal answer con- 
cerning the relative advantages of these teaching methods. 

MILES A. TINKER

University of Minnesota 
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